Reputation Gaming in Stack Overflow
Iren Mazloomzadeh, Gias Uddin, Foutse Khomh, Ashkan Sami
TL;DR
The paper addresses reputation gaming in Stack Overflow by first qualitatively characterizing manipulation scenarios from 1,697 meta posts and then designing two detectors to automatically identify suspicious reputation gains: Algorithm 1 targets suspicious communities via interaction graphs and Louvain modularity, while Algorithm 2 flags individual users with abrupt reputation gains. Empirical evaluation against Stack Overflow reputation histories shows these detectors outperform two baselines and reveal distinct fraud patterns, including both up-increase and down-impact strategies. The authors also engage with Stack Overflow moderators, obtaining validation and actionable feedback that underscores the need for complementary detection approaches. Overall, the work highlights risks to research relying on Stack Overflow data and offers concrete methods for platform designers to mitigate manipulation and preserve trust in crowd-sourced developer knowledge.
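To make the community-oriented detector more concrete, the sketch below scores how isolated a candidate group of users is. It is not the paper's Algorithm 1 and it does not run Louvain: it assumes a community has already been found by some community detection step and only measures what share of that group's interactions stay inside the group. The function name `isolation_ratio`, the `(source, target)` pair format, and the sample data are all hypothetical.

```python
from typing import Iterable, Set, Tuple


def isolation_ratio(interactions: Iterable[Tuple[str, str]], community: Set[str]) -> float:
    """Fraction of the community's interactions that stay inside the community.

    `interactions` is an assumed format: (source_user, target_user) pairs.
    """
    internal = 0   # both endpoints are community members
    touching = 0   # at least one endpoint is a community member
    for source, target in interactions:
        source_in = source in community
        target_in = target in community
        if source_in or target_in:
            touching += 1
            if source_in and target_in:
                internal += 1
    return internal / touching if touching else 0.0


# Hypothetical toy data: users a, b, c interact almost only with each other.
interactions = [
    ("a", "b"), ("b", "a"), ("a", "c"), ("c", "b"), ("c", "a"),
    ("d", "a"),  # one interaction coming from outside the group
    ("e", "f"),  # unrelated to the group
]
ratio = isolation_ratio(interactions, {"a", "b", "c"})
print(f"isolation ratio: {ratio:.2f}")
```

A ratio close to 1 would correspond to an isolated community and a somewhat lower ratio to a semi-isolated one. Where to draw the line for flagging a group for review is a tuning choice that this sketch leaves open.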
Abstract
Stack Overflow's incentive system awards users reputation scores to ensure quality. The decentralized nature of the forum may make the incentive system prone to manipulation. This paper offers, for the first time, a comprehensive study of the reported types of reputation manipulation scenarios that might be exercised in Stack Overflow and of the prevalence of such reputation gamers, based on a qualitative study of 1,697 posts from meta Stack Exchange sites. We found four different types of reputation fraud scenarios, such as voting rings, where communities form to upvote each other repeatedly on similar posts. We developed algorithms that enable platform managers to automatically identify these suspicious reputation gaming scenarios for review. The first algorithm identifies isolated or semi-isolated communities in which probable reputation fraud may occur, mostly through members collaborating with each other. The second algorithm looks for sudden, unusually large jumps in users' reputation scores. We evaluated the performance of our algorithms by examining the reputation history dashboards of Stack Overflow users on the Stack Overflow website. We observed that around 60-80% of users flagged as suspicious by our algorithms had their reputation scores reduced by Stack Overflow.
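The idea of looking for sudden, unusually large reputation jumps can be illustrated with a minimal sketch. This is not the paper's second algorithm: the function name `sudden_jumps`, the per-day gain representation, and both thresholds (`factor` and `min_gain`) are assumptions chosen only for illustration. A day is flagged when its gain is large in absolute terms and also far above the user's median daily gain.

```python
from statistics import median
from typing import List


def sudden_jumps(daily_gains: List[int], factor: float = 10.0, min_gain: int = 200) -> List[int]:
    """Return indices of days whose reputation gain looks like an abrupt jump.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if not daily_gains:
        return []
    # Use the median as the "typical" day; clamp to 1 so that users with
    # mostly zero-gain days do not make every small gain look like a jump.
    typical = max(median(daily_gains), 1)
    return [
        day
        for day, gain in enumerate(daily_gains)
        if gain >= min_gain and gain >= factor * typical
    ]


# Hypothetical reputation gains for one user over seven days.
gains = [10, 25, 0, 15, 400, 20, 10]
flagged = sudden_jumps(gains)
print(flagged)  # [4]
```

Requiring both an absolute and a relative threshold is one way to avoid flagging highly active users whose large daily gains are normal for them. Flagged days would still need manual review, in line with the abstract's framing of the algorithms as aids for platform managers.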
