AutoBio: A Simulation and Benchmark for Robotic Automation in Digital Biology Laboratory

Zhiqian Lan, Yuxuan Jiang, Ruiqi Wang, Xuanbing Xie, Rongkui Zhang, Yicheng Zhu, Peihang Li, Tianshuo Yang, Tianxing Chen, Haoyu Gao, Xiaokang Yang, Xuelong Li, Hongyuan Zhang, Yao Mu, Ping Luo

TL;DR

AutoBio presents a simulator and benchmark to evaluate vision-language-action models in biology labs. It introduces a digitization pipeline for lab instruments, lab-specific physics plugins, and a Blender-based rendering stack to handle transparency and interactive instrument UIs. The benchmark includes 16 biologically grounded tasks across three difficulty levels, with 9 tasks used in experiments, and evaluates two SOTA VLA models (pi_0 and RDT), revealing gaps in precision manipulation, visual reasoning, and instruction following. The work highlights sim-to-real challenges and points to directions for more capable generalist robotic systems in professional environments.

Abstract

Vision-language-action (VLA) models have shown promise as generalist robotic policies by jointly leveraging visual, linguistic, and proprioceptive modalities to generate action trajectories. While recent benchmarks have advanced VLA research in domestic tasks, professional science-oriented domains remain underexplored. We introduce AutoBio, a simulation framework and benchmark designed to evaluate robotic automation in biology laboratory environments, an application domain that combines structured protocols with demanding precision and multimodal interaction. AutoBio extends existing simulation capabilities through a pipeline for digitizing real-world laboratory instruments, specialized physics plugins for mechanisms ubiquitous in laboratory workflows, and a rendering stack that supports dynamic instrument interfaces and transparent materials through physically based rendering. Our benchmark comprises biologically grounded tasks spanning three difficulty levels, enabling standardized evaluation of language-guided robotic manipulation in experimental protocols. We provide infrastructure for demonstration generation and seamless integration with VLA models. Baseline evaluations with two SOTA VLA models reveal significant gaps in precision manipulation, visual reasoning, and instruction following in scientific workflows. By releasing AutoBio, we aim to catalyze research on generalist robotic systems for complex, high-precision, and multimodal professional environments. The simulator and benchmark are publicly available to facilitate reproducible research.

Paper Structure

This paper contains 32 sections, 13 equations, 18 figures, and 5 tables.

Figures (18)

  • Figure 1: AutoBio framework. AutoBio decomposes complex experiments into fundamental biological primitives, which are then implemented via robotic motion primitives within a specialized simulation environment. The AutoBio simulator features an instrument digitization pipeline, custom physics plugins for lab mechanisms, and a rendering stack supporting dynamic interfaces and transparent materials. This enables the creation of biologically grounded benchmark tasks to evaluate VLA models on precision control, instruction following, and visual reasoning capabilities in scientific workflows.
  • Figure 2: Digitized instruments for fundamental biological experiment operations, with a vortex mixer example demonstrating the proposed workflow for digitizing real-world instruments.
  • Figure 3: AutoBio physics plugins
  • Figure 4: AutoBio rendering features
  • Figure 5: Task progression across three difficulty levels. Each step includes a bordered inset (top-left) showing supplementary camera perspectives for contextual clarity.
  • ...and 13 more figures