Exploring Topic Modelling of User Reviews as a Monitoring Mechanism for Emergent Issues Within Social VR Communities
Angelo Singh, Joseph O'Hagan
TL;DR
This study tackles the challenge of monitoring emergent harassment in social VR by applying natural language processing to a large corpus of Rec Room reviews from Steam (roughly 40,000 items). It implements a multi-step pipeline: sentiment analysis with supervised labeling and pretrained models, followed by term-frequency investigations and BERTopic-based clustering to extract eight high-level harassment themes. The results show a downward trend in average sentiment over time and reveal recurring issues such as toxic communities, child-related harassment, racism, and moderation/technical problems. The work demonstrates that scalable, platform-agnostic monitoring of social VR communities is feasible using publicly available review data and motivates richer data collection and reporting mechanisms to support ongoing safety research and developer interventions.
Abstract
Users of social virtual reality (VR) platforms often use user reviews to document incidents of harassment they have witnessed and/or experienced. However, research has yet to explore using this data as a monitoring mechanism to identify emergent issues within social VR communities. Such a system would be of considerable benefit to developers and researchers: it would enable the automatic identification of emergent issues as they occur, provide a means of longitudinally analysing harassment, and reduce the reliance on alternative, high-cost monitoring methodologies, e.g. observation or interview studies. To contribute towards the development of such a system, we collected approximately 40,000 Rec Room user reviews from the Steam storefront. We then analysed the dataset's sentiment and word/term frequencies, and conducted a topic modelling analysis of the negative reviews detected in the dataset. We report that our approach was capable of longitudinally monitoring changes in review sentiment and of identifying high-level themes related to types of harassment known to occur on social VR platforms.
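
As a rough illustration of the first two analysis steps named above (tracking sentiment over time and counting terms in negative reviews), the following minimal sketch uses only the Python standard library. It is not the authors' implementation: the toy records, the stopword list, the assumption that each review already carries a sentiment score in [-1, 1] from some upstream model, and the function names `monthly_mean_sentiment` and `negative_term_counts` are all hypothetical. The topic modelling step is deliberately left out.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy records: (month, sentiment score in [-1, 1], review text).
# In practice the score would come from an upstream sentiment model.
REVIEWS = [
    ("2021-01", 0.8, "Great game, friendly community"),
    ("2021-01", -0.6, "Toxic players and no moderation"),
    ("2021-02", -0.9, "Toxic kids screaming, moderation is useless"),
]

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "and", "is", "no", "the"}


def monthly_mean_sentiment(reviews):
    """Return {month: mean sentiment score}, with months in sorted order."""
    scores_by_month = defaultdict(list)
    for month, score, _text in reviews:
        scores_by_month[month].append(score)
    return {m: sum(s) / len(s) for m, s in sorted(scores_by_month.items())}


def negative_term_counts(reviews, threshold=0.0):
    """Count non-stopword terms in reviews scoring below `threshold`."""
    counts = Counter()
    for _month, score, text in reviews:
        if score < threshold:
            tokens = re.findall(r"[a-z']+", text.lower())
            counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


if __name__ == "__main__":
    print(monthly_mean_sentiment(REVIEWS))
    print(negative_term_counts(REVIEWS).most_common(3))
```

A falling monthly mean would flag a period worth inspecting, and the most frequent terms in that period's negative reviews give a first hint of the issue before any topic model is applied.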
