Viser: Imperative, Web-based 3D Visualization in Python

Brent Yi, Chung Min Kim, Justin Kerr, Gina Wu, Rebecca Feng, Anthony Zhang, Jonas Kulhanek, Hongsuk Choi, Yi Ma, Matthew Tancik, Angjoo Kanazawa

TL;DR

Viser addresses the need for easy yet extensible 3D visualization in computer vision and robotics by offering a web-based viewer, scene primitives, and GUI components with an imperative Python API. Its architecture separates user-facing APIs from the web frontend, enabling bidirectional Python-web synchronization and real-time visualization. The approach blends general-purpose visualization with domain-specific capabilities, aiming to accelerate research and debugging on both offline data and real-time robotics. Limitations include WebSocket overhead and Python-only bindings, but the design prioritizes simplicity and integration with common Python workflows.

Abstract

We present Viser, a 3D visualization library for computer vision and robotics. Viser aims to bring easy and extensible 3D visualization to Python: we provide a comprehensive set of 3D scene and 2D GUI primitives, which can be used independently with minimal setup or composed to build specialized interfaces. This technical report describes Viser's features, interface, and implementation. Key design choices include an imperative-style API and a web-based viewer, which improve compatibility with modern programming patterns and workflows.
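The imperative style and bidirectional Python-web synchronization described above can be illustrated with a toy model. The sketch below is not Viser's API or implementation: `ToyServer`, `ToyHandle`, `outbox`, and `receive` are hypothetical names invented for illustration, and the "network" is just an in-memory list. It only shows the general pattern, under those assumptions: adding an element returns a handle, assigning to a handle property queues an update for the viewer, and a change arriving from the viewer triggers Python callbacks.

```python
from typing import Any, Callable


class ToyHandle:
    """Hypothetical handle to one scene or GUI element (not a Viser class)."""

    def __init__(self, server: "ToyServer", name: str, props: dict) -> None:
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_server", server)
        object.__setattr__(self, "_name", name)
        object.__setattr__(self, "_props", props)
        object.__setattr__(self, "_callbacks", [])

    def __getattr__(self, prop: str) -> Any:
        # Only reached when normal attribute lookup fails.
        try:
            return self._props[prop]
        except KeyError:
            raise AttributeError(prop) from None

    def __setattr__(self, prop: str, value: Any) -> None:
        # Python -> viewer: a plain assignment queues an update message.
        self._props[prop] = value
        self._server.outbox.append(("update", self._name, {prop: value}))

    def on_update(self, callback: Callable[["ToyHandle"], None]):
        # Register a callback for changes that originate in the viewer.
        self._callbacks.append(callback)
        return callback

    def _apply_remote(self, prop: str, value: Any) -> None:
        # Viewer -> Python: store the new value, then notify listeners.
        self._props[prop] = value
        for callback in self._callbacks:
            callback(self)


class ToyServer:
    """Hypothetical server that records outgoing messages in a list."""

    def __init__(self) -> None:
        self.outbox: list = []
        self._handles: dict = {}

    def add(self, name: str, **props: Any) -> ToyHandle:
        handle = ToyHandle(self, name, dict(props))
        self._handles[name] = handle
        self.outbox.append(("add", name, dict(props)))
        return handle

    def receive(self, name: str, prop: str, value: Any) -> None:
        # Simulates a message arriving from the web client.
        self._handles[name]._apply_remote(prop, value)


server = ToyServer()
cloud = server.add("/points", point_size=0.01)
slider = server.add("/gui/size", value=0.01)


@slider.on_update
def _sync(handle: ToyHandle) -> None:
    # Imperative update: mutate the scene element directly.
    cloud.point_size = handle.value


# Pretend the user dragged the slider in the browser.
server.receive("/gui/size", "value", 0.05)
```

In this toy, the simulated slider change runs the callback, which assigns to `cloud.point_size` and so queues one update message for the viewer. A real system would serialize such messages over a transport such as WebSockets rather than append them to a list.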

Paper Structure

This paper contains 10 sections and 8 figures.

Figures (8)

  • Figure 1: Visualization for computer vision. Viser provides a web-based viewer, scene primitives, and GUI primitives for visualization. These can be composed for visualization in a broad set of applications. From top-left: (i) Monocular 4D reconstruction from Shape of Motion [wang2024shapeofmotion], showing dynamic scene render and point tracks. (ii) Visualization of the mip-NeRF 360 dataset [barron2022mipnerf360] from COLMAP [schonberger2016structure], with camera poses and point clouds. (iii) Interactive SMPL [loper2015smpl] human model viewer with pose and shape parameter controls. (iv) Dynamic human motion, contact, and scene visualization from VideoMimic [allshire2025videomimic].
  • Figure 2: Visualization for robotics. Viser's scene and GUI primitives are useful for common robotics problems. From top-left: (i) Interactive inverse kinematics with 6D pose input in PyRoki [kim2025pyroki]. (ii) Policy rollout visualization for reinforcement learning in VideoMimic [allshire2025videomimic]. (iii) Humanoid control visualization from policies trained in IsaacGym [makoviychuk2021isaacgym]. (iv) Batched rendering for parallel simulation in MuJoCo Playground [zakka2025mujoco].
  • Figure 3: Nerfstudio [tancik2023nerfstudio]. An example of a domain-specific tool built with Viser's scene and GUI primitives. The viewer supports real-time rendering of neural radiance fields [mildenhall2020nerf] and 3D Gaussian splats [kerbl2023gaussian], training visualization, and camera path creation.
  • Figure 4: Real-time visualization for robotics. Viser enables live debugging of perception and control systems on physical robots. Left: A humanoid robot executing a learned locomotion policy [allshire2025videomimic]. Right: Visualizing real-time state estimation and mapping in Viser. Viser's web-based architecture simplifies remote monitoring and debugging.
  • Figure 5: Populating Viser. Viser is used to display inputs and outputs from a multiview reconstruction pipeline [muller2025reconstructing]. We show (a) the viewer, (b) example code for adding 3D scene elements, and (c) example code for populating the graphical interface.
  • ...and 3 more figures