Towards a performance characteristic curve for model evaluation: an application in information diffusion prediction
Wenjin Xie, Xiaomeng Wang, Radosław Michalski, Tao Jia
TL;DR
The paper addresses the challenge of comparing information diffusion prediction models across datasets that differ in randomness and complexity. It introduces Average Pairwise Comparison Entropy ($APCE$) to quantify the randomness of diffusion data and develops a scaled accuracy metric ($SMAP$) that, when plotted against $APCE$, yields an exponential performance curve characterizing a model's inherent predictive capability under uncertainty. The authors validate the approach on three prediction models from the same family and apply it to eight state-of-the-art methods, plus a case study, showing that the curve captures nuanced differences that single-point metrics miss. The framework provides a systematic, cross-task evaluation tool that may also apply beyond information diffusion to other sequence-prediction tasks.
Abstract
Information diffusion prediction on social networks aims to predict the future recipients of a message, with practical applications in marketing and social media. While different prediction models all claim to perform well, general frameworks for performance evaluation remain limited. Here, we aim to identify a performance characteristic curve for a model, which captures its performance on tasks of different complexity. We propose a metric based on information entropy to quantify the randomness in diffusion data. We then identify a scaling pattern between the randomness and the prediction accuracy of the model. With properly adjusted variables, data points from different sequence lengths, system sizes, and levels of randomness all collapse onto a single curve. The curve captures a model's inherent capability to make correct predictions under increasing uncertainty, which we regard as the performance characteristic curve of the model. The validity of the curve is tested on three prediction models in the same family, reaching conclusions in line with existing studies. In addition, we apply the curve to assess the performance of eight state-of-the-art models, providing a clear and comprehensive evaluation even for models that are challenging to differentiate with conventional metrics. Our work reveals a pattern linking data randomness and prediction accuracy. The performance characteristic curve provides a new way to evaluate model performance systematically and sheds light on future studies of other frameworks for model evaluation.
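As a rough illustration of how such a curve might be extracted from data, the sketch below fits an exponential decay to (randomness, accuracy) pairs. It makes several assumptions that are not taken from the paper: the functional form $y = a e^{-bx}$, the log-linear least-squares fitting procedure, the function name `fit_exponential`, and the synthetic data points are all hypothetical. The paper's actual definitions of $APCE$ and $SMAP$ are not reproduced here.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(-b * x) by ordinary least squares on log(y).

    Hypothetical helper for illustration only; the functional form is an
    assumption and may differ from the curve used in the paper.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive to take logarithms")

    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n

    # Slope and intercept of the regression line log(y) = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxl = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    slope = sxl / sxx
    intercept = mean_log - slope * mean_x

    # Map back to the exponential parameters: a = exp(intercept), b = -slope.
    return math.exp(intercept), -slope


# Synthetic, made-up points generated from a known curve (not real results).
randomness = [0.0, 0.5, 1.0, 1.5, 2.0]
accuracy = [0.9 * math.exp(-1.2 * x) for x in randomness]

a, b = fit_exponential(randomness, accuracy)
print(f"a = {a:.3f}, b = {b:.3f}")
```

Under this assumed form, a smaller decay rate `b` would indicate a model whose accuracy degrades more slowly as randomness grows, which is the kind of comparison a single-point metric cannot express.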
