Polyhedral Conic Classifier for CTR Prediction
Beyza Turkmen, Ramazan Tarik Turksoy, Hasan Saribas, Hakan Cevikalp
TL;DR
This paper tackles CTR prediction under severe class imbalance and geometric asymmetry by introducing a Deep Compact Polyhedral Conic Classifier (DCPCC), which uses polyhedral conic functions to produce bounded, convex acceptance regions for the positive class. The method plugs into standard deep CTR architectures (embedding layer, interaction/MLP layers, and a specially designed output layer), using center loss to estimate the cone vertex and a BCE-based objective with a compactness term. Experiments on four public datasets (Criteo, Avazu, MovieLens, Frappe) show improvements over BCE-based baselines on most models and datasets, highlighting the approach's robustness to imbalance and distributional diversity. The work offers a practical, plug-in improvement for industrial recommender systems that require tight positive-region modeling and resilience to noisy, diverse negatives.
Abstract
This paper introduces a novel approach for click-through rate (CTR) prediction in industrial recommender systems that addresses two inherent challenges: numerical imbalance and geometric asymmetry. Numerical imbalance arises because positive (click) instances occur less frequently than negative (non-click) instances. Geometric asymmetry arises because positive samples exhibit visually coherent patterns, while negative samples are more diverse. To address these challenges, we use a deep neural network classifier based on polyhedral conic functions. This classifier is similar in spirit to one-class classifiers: it returns compact polyhedral acceptance regions that separate the positive class samples from the negative samples, which have diverse distributions. We conduct extensive experiments with state-of-the-art (SOTA) CTR prediction models on four public datasets: Criteo, Avazu, MovieLens, and Frappe. The experimental evaluations show that the proposed approach outperforms Binary Cross Entropy (BCE) loss, which is widely used in CTR prediction tasks.
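
To illustrate the idea of a compact polyhedral acceptance region, the following Python sketch evaluates a polyhedral conic function in a commonly cited form, f(x) = w·(x − c) + γ‖x − c‖₁ − b. The abstract does not state the exact formulation, so this form, the sign convention (f(x) ≤ 0 means "accept as positive"), and all names and values (pcf_score, l1_radius_bound, center, w, gamma, b) are assumptions chosen for illustration, not the authors' implementation. In the deep setting the function would presumably act on learned features with the vertex estimated during training; here the vertex is fixed by hand.

```python
import numpy as np


def pcf_score(x, center, w, gamma, b):
    """Assumed polyhedral conic function: w.(x - c) + gamma * ||x - c||_1 - b.

    Accepts a single point of shape (d,) or a batch of shape (n, d).
    Under the convention assumed here, score <= 0 means "inside the region".
    """
    d = np.asarray(x, dtype=float) - center
    return d @ w + gamma * np.abs(d).sum(axis=-1) - b


def l1_radius_bound(w, gamma, b):
    """Upper bound on ||x - c||_1 for any accepted point (assumes b > 0).

    Since w.d >= -max|w_i| * ||d||_1, score <= 0 implies
    (gamma - max|w_i|) * ||d||_1 <= b. The region is bounded only
    when gamma > max|w_i|; otherwise we return infinity.
    """
    slack = gamma - np.max(np.abs(w))
    return b / slack if slack > 0 else np.inf


# Hypothetical 2-D parameters, chosen only for the demonstration.
center = np.array([0.0, 0.0])   # cone vertex c
w = np.array([0.5, -0.25])      # linear term
gamma = 1.0                     # weight of the L1 term
b = 1.0                         # offset

near = pcf_score([0.2, 0.1], center, w, gamma, b)   # close to the vertex
far = pcf_score([5.0, 5.0], center, w, gamma, b)    # far from the vertex
bound = l1_radius_bound(w, gamma, b)

print("near score:", near, "accepted:", near <= 0)
print("far score:", far, "accepted:", far <= 0)
print("L1 radius bound:", bound)
```

With these values the point near the vertex is accepted and the distant point is rejected. Because gamma exceeds the largest |w_i|, every accepted point lies within a finite L1 distance of the vertex, which is the sense in which the acceptance region is bounded; the region is also convex because it is a sublevel set of a convex function.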
