BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI

Shwetha Rajaram, Nels Numan, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, Andrew D. Wilson

TL;DR

BlendScape introduces a rendering and composition system that enables end users to customize video-conferencing environments with AI-driven creation. It grounds generation in real or virtual backgrounds via inpainting and image-to-image restyling, and augments prompts with multimodal and LLM-driven guidance to shape meeting contexts. Through three demonstration scenarios and a 15-user exploratory study, the work shows broad potential for expressive, context-aware environments while highlighting realism and distraction trade-offs, and outlines concrete improvements for live deployment. The work advances user-centered, AI-assisted design for distributed collaboration and points to future work on per-user customization, automatic activity transitions, and improved spatial coherence in blended scenes.

Abstract

Today's video-conferencing tools support a rich range of professional and social activities, but their generic meeting environments cannot be dynamically adapted to align with distributed collaborators' needs. To enable end-user customization, we developed BlendScape, a rendering and composition system for video-conferencing participants to tailor environments to their meeting context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users' physical or digital backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an exploratory study with 15 end-users, we investigated whether and how they would find value in using generative AI to customize video-conferencing environments. Participants envisioned using a system like BlendScape to facilitate collaborative activities in the future, but required further controls to mitigate distracting or unrealistic visual elements. We implemented scenarios to demonstrate BlendScape's expressiveness for supporting environment design strategies from prior work and propose composition techniques to improve the quality of environments.

Paper Structure (61 sections, 15 figures)

Figures (15)

  • Figure 1: Classification of Environment Design Strategies: We analyzed how existing video-conferencing tools compose meeting spaces to support distributed collaboration by (A) depicting a shared context, (B) enhancing communication behaviors through spatial metaphors, and (C) capturing a record of collaboration within the space. In the scenario demonstration section, we use scenarios to demonstrate how BlendScape supports implementing eight of these ten design strategies (shown in bold).
  • Figure 2: Scenario 1: Design Brainstorming. To create a unified setting for brainstorming, two designers use BlendScape to blend their webcam backgrounds with a camera feed of a physical desk (a, b), enabling them to ideate around hand-drawn sketches. They later blend in elements of their digital task space, such as mock-ups of a mixed reality interface (c).
  • Figure 3: Overview of BlendScape interface: BlendScape offers two composition modes for creating meeting spaces (a): blending webcam feeds together via inpainting and transforming the image on the canvas via image-to-image. To steer the environment generation, end-users can specify text-based prompts for the Meeting Activity and Meeting Theme (c), control the strength of stylistic prompts (b), upload custom image priors (f), and modify specific regions of the scene via selection tools (e). Users can return to and iterate on previous environment designs via the history tools (g). The automatic layout techniques facilitate positioning users behind foreground objects in the scene (d). BlendScape also provides session management tools (h) and per-user controls (i) for adjusting the proportion of each user's video background preserved during the environment generation and for toggling between displaying live webcam feeds or static frames.
  • Figure 4: Environment Generation Techniques. BlendScape supports composing meeting spaces through (a) blending video feeds together via inpainting techniques and (b) transforming an input image (i.e., image prior) via image-to-image techniques. These composition approaches can be chained, e.g., to restyle a blended environment in the theme of a library (c).
  • Figure 5: Masking Video Backgrounds: Users can adjust the proportion of their physical or virtual surroundings to retain in the resulting blended environments.
  • ...and 10 more figures