tensorFM: Low-Rank Approximations of Cross-Order Feature Interactions

Alessio Mazzetto, Mohammad Mahdi Khalili, Laura Fee Nern, Michael Viderman, Alex Shtoff, Krzysztof Dembczyński

TL;DR

tensorFM introduces a low-rank, CP-decomposed tensor framework for modeling higher-order cross-field interactions in multi-field categorical data. By constraining the interaction tensors to have low CP rank, it achieves fast, scalable inference with cost bounded by $O(nkrd^2)$, while learning higher-order effects beyond those captured by traditional FwFM. Empirical results on online advertising benchmarks, COMPAS, and synthetic data show competitive accuracy and favorable latency. Interpretability is demonstrated through the correlation of learned interaction strengths with mutual information and through visualizations of the most influential field interactions. The approach offers a practical, transparent alternative to deep models for real-time prediction tasks, and combining it with neural components is left as future work. Overall, tensorFM provides a principled and scalable method to capture and interpret cross-order feature interactions in large, sparse tabular datasets.

Abstract

We address prediction problems on tabular categorical data, where each instance is defined by multiple categorical attributes, each taking values from a finite set. These attributes are often referred to as fields, and their categorical values as features. Such problems frequently arise in practical applications, including click-through rate prediction and the social sciences. We introduce and analyze tensorFM, a new model that efficiently captures high-order interactions between attributes via a low-rank tensor approximation representing the strength of these interactions. Our model generalizes field-weighted factorization machines. Empirically, tensorFM performs competitively with state-of-the-art methods. Additionally, its low latency makes it well suited for time-sensitive applications, such as online advertising.

Paper Structure (17 sections, 4 theorems, 22 equations, 5 figures, 7 tables)

This paper contains 17 sections, 4 theorems, 22 equations, 5 figures, 7 tables.

Key Result

Proposition 1

Let ${\bm{S}} \in \mathbb{R}^{n \times n}$ be a rank $r$ matrix. Then, after an appropriate preprocessing step depending only on ${\bm{S}}$, it is possible to evaluate $\langle {\bm{A}}_{\bm{x}}^T {\bm{A}}_{\bm{x}}, {\bm{S}} \rangle_F$ in time $O(rnk)$ for any $\bm{x} \in \mathcal{D}_{m,n}$.
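
The following is a minimal sketch of how a bound of this kind can arise, not the authors' implementation. It assumes that ${\bm{A}}_{\bm{x}}$ is a $k \times n$ matrix whose columns are the embedding vectors of the active features of $\bm{x}$ (this definition does not appear in this excerpt), and that the preprocessing step is a rank-$r$ factorization ${\bm{S}} = {\bm{U}}{\bm{V}}^T$ obtained here by truncated SVD. Under those assumptions, $\langle {\bm{A}}^T {\bm{A}}, {\bm{U}}{\bm{V}}^T \rangle_F = \sum_{j=1}^{r} \langle {\bm{A}}{\bm{u}}_j, {\bm{A}}{\bm{v}}_j \rangle$, so the $n \times n$ Gram matrix never needs to be formed. The function names `factorize_low_rank`, `fast_inner`, and `naive_inner` are hypothetical.

```python
import numpy as np


def factorize_low_rank(S, r):
    """Preprocessing (depends only on S): return U, V with S ~= U @ V.T.

    Assumption: a truncated SVD is used; any exact rank-r factorization works.
    """
    W, s, Vt = np.linalg.svd(S)
    U = W[:, :r] * s[:r]   # n x r, singular values folded into U
    V = Vt[:r].T           # n x r
    return U, V


def fast_inner(A, U, V):
    """Evaluate <A^T A, U V^T>_F without forming the n x n Gram matrix.

    A is k x n (assumed layout: one embedding column per field).
    The two products below cost O(r n k) in total.
    """
    AU = A @ U  # k x r
    AV = A @ V  # k x r
    return float(np.sum(AU * AV))


def naive_inner(A, S):
    """Reference computation: forms A^T A explicitly, costing O(n^2 k)."""
    return float(np.sum((A.T @ A) * S))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k, r = 8, 4, 2
    S = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))  # rank-r matrix
    A = rng.normal(size=(k, n))
    U, V = factorize_low_rank(S, r)
    print(np.isclose(fast_inner(A, U, V), naive_inner(A, S)))
```

The factorization is computed once per model, while `fast_inner` is the only part that runs per instance, which is what makes the per-instance cost independent of $n^2$.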

Figures (5)

  • Figure 1: Test AUC as a function of rank $r$. Increasing rank and order of interaction improves the performance.
  • Figure 2: Loss as a function of rank $r$. Increasing rank and order of interaction decreases the loss.
  • Figure 3: Inference time of different models measured in FLOPs varying the number of input fields.
  • Figure 4: Overlap of top-$k$ interactions: tensorFM(3,3) vs mutual information. The dashed line represents the expected overlap with a random order.
  • Figure 5: Heatmap of the 36 strongest learned third-order interaction strengths aggregated across field triplets.

Theorems & Definitions (4)

  • Proposition 1
  • Lemma 2
  • Theorem 3
  • Lemma 4