Nods of Agreement: Webcam-Driven Avatars Improve Meeting Outcomes and Avatar Satisfaction Over Audio-Driven or Static Avatars in All-Avatar Work Videoconferencing

Fang Ma, Ju Zhang, Lev Tankelevitch, Payod Panda, Torang Asadi, Charlie Hewitt, Lohit Petikam, James Clemoes, Marco Gillies, Xueni Pan, Sean Rintel, Marta Wilczkowiak

TL;DR

This work investigates how avatar animation modalities influence work meeting outcomes and satisfaction by conducting a within-subjects mixed-methods study with 68 employees using three avatar modalities. The findings show that webcam-driven avatar motion improves meeting effectiveness, comfort, and inclusivity compared to static avatars, and generally exceeds audio-only animation for avatar satisfaction, with strong preference for webcam animation in focus-group selection. Qualitative analysis reveals ten thematic factors, emphasizing expressiveness, nonverbal cues, and cognitive load as key drivers, and demonstrates that meaningful motion often matters more than visual realism for meeting outcomes. The results support adopting webcam-animated avatars as a plausible alternative to video in remote work, while highlighting design tradeoffs between motion fidelity and appearance realism for different objectives.

Abstract

Avatars are edging into mainstream videoconferencing, but evaluation of how avatar animation modalities contribute to work meeting outcomes has been limited. We report a within-group videoconferencing experiment in which 68 employees of a global technology company, in 16 groups, used the same stylized avatars in three modalities (static picture, audio-animation, and webcam-animation) to complete collaborative decision-making tasks. Quantitatively, for meeting outcomes, webcam-animated avatars improved meeting effectiveness over the picture modality and were also reported to be more comfortable and inclusive than both other modalities. In terms of avatar satisfaction, there was a similar preference for webcam animation as compared to both other modalities. Our qualitative analysis shows participants expressing a preference for the holistic motion of webcam animation, and that meaningful movement outweighs realism for meeting outcomes, as evidenced through a systematic overview of ten thematic factors. We discuss implications for research and commercial deployment and conclude that webcam-animated avatars are a plausible alternative to video in work meetings.

Paper Structure

This paper contains 38 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Overview of the experimental protocol based on the research questions. To adequately address RQ2 (avatar satisfaction), (a) personalized avatars were generated based on participants' self videos and photographs. (b) The animation modality order for the avatars was randomized in advance for each group. Avatars were streamed to a Microsoft Teams videoconferencing meeting for the experiment. To address RQ1 (meeting outcomes), (c) participants in a group joined a Microsoft Teams meeting using their avatars to complete the experimental task sessions. Before starting the tasks, participants individually completed an onboarding questionnaire. (d) They then performed three five-minute group decision-making tasks, each using a different avatar animation modality: static picture mode (S), audio-animated mode (A), and webcam-animated mode (W). Each task was followed by a post-task questionnaire. Lastly, participants took part in a 15-minute focus group, in which each participant could use whichever of the three avatar modalities they preferred.
  • Figure 2: Quantitative results comparing the three avatar modalities on the outcomes of meeting effectiveness, alignment, comfort, and inclusivity. Violin plots depict distributions of numeric data using density curves. Red dots and lines depict individual responses, illustrating comparisons across the three modalities. Black crosses indicate mean values. For effectiveness, a yes/no response scale was used to measure whether the meeting reached an agreed decision. For alignment, comfort, and inclusivity, a five-point response scale was used, where 1 indicates 'low' and 5 indicates 'high'. * indicates p < 0.05, ** indicates p < 0.01, 'ns' indicates not statistically significant.
  • Figure 3: Thematic interrelations between the meeting outcomes of effectiveness ($\textit{Q}_{\textit{\small{MO-eff-text}}}$) and comfort ($\textit{Q}_{\textit{\small{MO-comf-text}}}$) and the avatar satisfaction factor of preference ($\textit{Q}_{\textit{\small{AS-pref-text}}}$), with themes ranked in descending order of impact (primary, secondary, and tertiary sections). For additional interpretation of the qualitative Sankey flow, see the supplementary material.