tensorFM: Low-Rank Approximations of Cross-Order Feature Interactions
Alessio Mazzetto, Mohammad Mahdi Khalili, Laura Fee Nern, Michael Viderman, Alex Shtoff, Krzysztof Dembczyński
TL;DR
tensorFM introduces a low-rank, CP-decomposed tensor framework for modeling higher-order cross-field interactions in multi-field categorical data. Constraining the interaction tensors to low CP rank yields fast, scalable inference, with complexity bounded by $O(nkrd^2)$, while enabling the model to learn higher-order effects beyond those captured by traditional FwFM. Empirical results on online advertising benchmarks, COMPAS, and synthetic data show competitive accuracy and notably favorable latency. Interpretability is demonstrated through correlations with mutual information and clear visualizations of influential field interactions. The approach offers a practical, transparent alternative to deep models for real-time prediction tasks and opens avenues for combining it with neural components in future work. Overall, tensorFM provides a principled and scalable method for capturing and interpreting cross-order feature interactions in large, sparse tabular datasets.
Abstract
We address prediction problems on tabular categorical data, where each instance is defined by multiple categorical attributes, each taking values from a finite set. These attributes are often referred to as fields, and their categorical values as features. Such problems frequently arise in practical applications, including click-through rate prediction and the social sciences. We introduce and analyze tensorFM, a new model that efficiently captures high-order interactions between attributes via a low-rank tensor approximation representing the strength of these interactions. Our model generalizes field-weighted factorization machines. Empirically, tensorFM demonstrates competitive performance with state-of-the-art methods. Additionally, its low latency makes it well-suited for time-sensitive applications, such as online advertising.
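To illustrate why a low-rank (CP) form of the interaction tensor can make inference cheap, the following minimal sketch compares a brute-force sum over all field tuples with an equivalent factorized computation. It is a simplified illustration under stated assumptions, not the formulation from the paper: the names `V` (one embedding per active field), `factors` (one CP factor matrix per tensor mode), and both function names are hypothetical, the sum runs over all field tuples including repeated fields, and the cost of this simplified variant is not meant to reproduce the bound quoted above.

```python
import itertools
import numpy as np

def brute_force_interaction(V, factors):
    """Enumerate all n**d field tuples (hypothetical reference implementation).

    V:       (n, k) array, one k-dimensional embedding per field.
    factors: list of d arrays of shape (n, r), the assumed CP factors of
             an order-d interaction-strength tensor W.
    """
    n, _ = V.shape
    d = len(factors)
    total = 0.0
    for fields in itertools.product(range(n), repeat=d):
        # CP entry W[f_1, ..., f_d] = sum_r prod_l factors[l][f_l, r]
        w = np.sum(np.prod([factors[l][fields[l]] for l in range(d)], axis=0))
        # Multiplicative interaction of the selected embeddings
        inter = np.sum(np.prod([V[f] for f in fields], axis=0))
        total += w * inter
    return total

def factorized_interaction(V, factors):
    """Same quantity without enumerating tuples.

    Uses the identity
      sum_{f_1..f_d} W[f_1..f_d] * sum_k prod_l V[f_l, k]
        = sum_{r, k} prod_l (factors[l].T @ V)[r, k].
    """
    r = factors[0].shape[1]
    k = V.shape[1]
    acc = np.ones((r, k))
    for U in factors:
        acc *= U.T @ V  # (r, n) @ (n, k) -> (r, k), one product per mode
    return acc.sum()

# Small synthetic check that both computations agree.
rng = np.random.default_rng(0)
n, k, r, d = 4, 3, 2, 3
V = rng.normal(size=(n, k))
factors = [rng.normal(size=(n, r)) for _ in range(d)]
assert np.isclose(brute_force_interaction(V, factors),
                  factorized_interaction(V, factors))
```

In this sketch the brute-force version visits every one of the n**d field tuples, while the factorized version needs only one small matrix product per tensor mode. That gap is the kind of saving a low-rank parameterization makes possible.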
