Agent for User: Testing Multi-User Interactive Features in TikTok

Sidong Feng, Changhao Du, Huaxiao Liu, Qingnan Wang, Zhengwei Lv, Gang Huo, Xu Yang, Chunyang Chen

TL;DR

The paper tackles the challenge of testing multi-user interactive features in TikTok by introducing a multi-agent LLM-based framework that uses a virtual device farm and dedicated agent roles to simulate collaborative user interactions. It demonstrates that the approach achieves high task coverage (75%) and strong action alignment (85.9%) while delivering substantial developer time savings (87%), and it can identify real-world bugs (26 found) when integrated into a production testing platform. The core contributions include the device-farm design, the Coordinator/Operator/Observer agent roles, an action-space and GUI representation tailored for LLMs, and a rigorous evaluation across both controlled tasks and industrial deployment. The work has practical implications for cost-effective, scalable testing of multi-user features in social apps and lays groundwork for cross-app extensions and localized LLM deployments.

Abstract

TikTok, a widely used social media app with over a billion monthly active users, requires effective quality assurance for its intricate features, and feature testing is crucial to achieving this. However, the app's multi-user interactive features, such as live streaming and voice calls, pose significant challenges for developers, who must manage multiple devices simultaneously and coordinate user interactions across them. To address this, we introduce a novel multi-agent approach, powered by Large Language Models (LLMs), to automate the testing of multi-user interactive app features. Specifically, we build a virtual device farm that allocates the necessary number of devices for a given multi-user interactive task. On each device, we deploy an LLM-based agent that simulates a user, so that the agents mimic user interactions and collaboratively automate the testing process. Evaluations on 24 multi-user interactive tasks within the TikTok app show that the approach covers 75% of tasks with 85.9% action similarity and offers developers 87% time savings. We have also integrated our approach into the real-world TikTok testing platform, where it has aided in the detection of 26 multi-user interactive bugs.
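
The allocation pattern described in the abstract (a device farm hands out one device per simulated user, and one agent drives each device) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `Device`, `Agent`, and `DeviceFarm`, the `run_task` helper, and the `host`/`guest` roles are all hypothetical, and the LLM call is replaced by a stub that only records the requested action.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """A virtual device in the pool (hypothetical model)."""
    device_id: str
    in_use: bool = False


@dataclass
class Agent:
    """Stand-in for an LLM-based agent that simulates one user (hypothetical)."""
    role: str
    device: Device
    log: list = field(default_factory=list)

    def act(self, action: str) -> str:
        # A real agent would query an LLM with the current GUI state.
        # Assumption: here we only record the action that was requested.
        entry = f"{self.role}@{self.device.device_id}: {action}"
        self.log.append(entry)
        return entry


class DeviceFarm:
    """Toy pool of virtual devices (hypothetical interface)."""

    def __init__(self, size: int):
        self.devices = [Device(f"dev-{i}") for i in range(size)]

    def allocate(self, count: int) -> list:
        # Hand out the first `count` free devices, or fail loudly.
        free = [d for d in self.devices if not d.in_use]
        if len(free) < count:
            raise RuntimeError(f"need {count} devices, only {len(free)} free")
        chosen = free[:count]
        for d in chosen:
            d.in_use = True
        return chosen

    def release(self, devices: list) -> None:
        for d in devices:
            d.in_use = False


def run_task(farm: DeviceFarm, roles: list, script: list) -> list:
    """Allocate one device per role, replay a scripted interaction, release."""
    devices = farm.allocate(len(roles))
    agents = {role: Agent(role, dev) for role, dev in zip(roles, devices)}
    try:
        return [agents[role].act(action) for role, action in script]
    finally:
        # Always return devices to the pool, even if a step fails.
        farm.release(devices)
```

For example, `run_task(DeviceFarm(3), ["host", "guest"], [("host", "start LIVE"), ("guest", "join LIVE")])` allocates two of the three devices, lets each stub agent record its step in order, and returns both devices to the pool afterwards.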

Paper Structure

This paper contains 25 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Illustration of LIVE together.
  • Figure 2: Example of the script for multi-user feature testing.
  • Figure 3: The overview of our approach.
  • Figure 4: The example of prompting agent.
  • Figure 5: Failure examples of our approach.
  • ...and 2 more figures