ARPOV: Expanding Visualization of Object Detection in AR with Panoramic Mosaic Stitching

Erin McGowan, Ethan Brewer, Claudio Silva

TL;DR

ARPOV is an interactive visual analytics tool for analyzing object detection model outputs on video captured by an AR headset. It uses panorama stitching to expand the view of the environment, automatically filters undesirable frames, and is designed to help users understand model performance.

Abstract

As the uses of augmented reality (AR) become more complex and widely available, AR applications will increasingly incorporate intelligent features (e.g., an intelligent assistant) that require developers to understand the user's behavior and surrounding environment. Such applications rely on video captured by an AR headset, which often exhibits disjointed camera movement and has a limited field of view that cannot capture the full scope of what the user sees at any given time. Moreover, standard methods of visualizing object detection model outputs are limited to capturing objects within a single frame and timestep, and therefore fail to capture the temporal and spatial context that is often necessary for various domain applications. We propose ARPOV, an interactive visual analytics tool for analyzing object detection model outputs tailored to video captured by an AR headset that maximizes user understanding of model performance. The proposed tool leverages panorama stitching to expand the view of the environment while automatically filtering undesirable frames, and includes interactive features that facilitate object detection model debugging. ARPOV was designed as part of a collaboration between visualization researchers and machine learning and AR experts; we validate our design choices through interviews with five domain experts.

Paper Structure

This paper contains 18 sections and 6 figures.

Figures (6)

  • Figure 1: The Annotated Range Slider, including (A) range selection for the start and end time of stitching window; (B) tick marks denoting frames in which a new object label has been detected (green), frames in which the same object label has been detected multiple times (orange), and times when a previously detected object label has not been detected for a specified count of frames (red); (C) on-hover text boxes with tick mark details.
  • Figure 2: The Timeline View allows users to toggle between three displays: (A) the Summary Matrix View, which displays model prediction confidence and the intersection over union (IoU) of predicted bounding boxes with ground truth bounding boxes; (B) the Detection Classification View, with the counts of true positive, true negative, false positive, and false negative predictions for each frame; (C) the Distance View, which shows the distance between the centroids of consecutive predicted bounding boxes for each object adjusted within a panorama of selected frames.
  • Figure 3: The panorama construction pipeline takes in raw frames (A), computes and filters their homographies (B), positions (C) and composites (D) them on the canvas, and overlays the corresponding object detection model (ODM) outputs (E).
  • Figure 4: ARPOV enables users to view ODM outputs in three different visualization styles: (A) Bounding Boxes, (B) Centroids, and (C) Arrows. Each color denotes a different object label, as shown in the key.
  • Figure 5: The ARPOV homography filtering feature removes frames that cause distortion in the panoramic mosaic. Shown are (A) a panorama generated using out-of-the-box OpenCV functions, (B) the same panorama after removing frames stretched far out of proportion with the rest of the mosaic, and (C) the same panorama after removing vertically and horizontally flipped frames.
  • ...and 1 more figure
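
The Figure 2 caption names two per-object quantities: the intersection over union (IoU) of a predicted bounding box with its ground truth box, and the distance between the centroids of consecutive predicted boxes. The sketch below shows one common way to compute both. It is not taken from the ARPOV implementation: the `(x_min, y_min, x_max, y_max)` box layout and the function names `iou`, `centroid`, and `consecutive_centroid_distances` are assumptions made for illustration, and the boxes are assumed to already be expressed in a shared coordinate system such as the panorama canvas.

```python
from math import hypot

# Assumed box layout: (x_min, y_min, x_max, y_max) in a shared pixel space.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (0.0 if disjoint)."""
    # Overlap width and height, clamped at zero when the boxes do not meet.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def centroid(box: Box) -> tuple[float, float]:
    """Center point of a box."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def consecutive_centroid_distances(boxes: list[Box]) -> list[float]:
    """Euclidean distance between centroids of each consecutive pair of boxes.

    `boxes` is assumed to hold one object's predictions in frame order.
    """
    centers = [centroid(b) for b in boxes]
    return [
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(centers, centers[1:])
    ]
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, since the boxes overlap in a 1×1 square and together cover 7 square units. A sudden jump in the centroid distance series for one label is the kind of event a distance view could make visible, although how ARPOV adjusts and displays these values is described only at the level of the caption above.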