Bitbox: Behavioral Imaging Toolbox for Computational Analysis of Behavior from Videos

Evangelos Sariyanidi, Gokul Nair, Lisa Yankowitz, Casey J. Zampella, Mohan Kashyap Pargi, Aashvi Manakiwala, Maya McNealis, John D. Herrington, Jeffrey Cohn, Robert T. Schultz, Birkan Tunc

TL;DR

The paper addresses the barrier to adopting AI-based behavioral measurement from video in behavioral and clinical research by introducing Bitbox, an open-source, modular toolkit that unifies multiple backends under a single API and maps low-level signals to high-level behavioral metrics. It presents a two-layer architecture (a backend processor layer and a measurement layer) that supports adding new processors and domain-relevant metrics while preserving reproducibility through automated metadata and containerized deployment. The framework also covers visualization, data and metadata management, and open-science features, with the aim of accelerating cross-domain dissemination of computational behavioral measures. By letting researchers extract robust, interpretable behavioral metrics without engineering expertise, Bitbox seeks to bridge the translational gap between AI advances and behavioral science practice.

Abstract

Computational measurement of human behavior from video has recently become feasible due to major advances in AI. These advances now enable granular and precise quantification of facial expression, head movement, body action, and other behavioral modalities and are increasingly used in psychology, psychiatry, neuroscience, and mental health research. However, mainstream adoption remains slow. Most existing methods and software are developed for engineering audiences, require specialized software stacks, and fail to provide behavioral measurements at a level directly useful for hypothesis-driven research. As a result, there is a large barrier to entry for researchers who wish to use modern, AI-based tools in their work. We introduce Bitbox, an open-source toolkit designed to remove this barrier and make advanced computational analysis directly usable by behavioral scientists and clinical researchers. Bitbox is guided by principles of reproducibility, modularity, and interpretability. It provides a standardized interface for extracting high-level behavioral measurements from video, leveraging multiple face, head, and body processors. The core modules have been tested and validated on clinical samples and are designed so that new measures can be added with minimal effort. Bitbox is intended to serve both sides of the translational gap. It gives behavioral researchers access to robust, high-level behavioral metrics without requiring engineering expertise, and it provides computer scientists a practical mechanism for disseminating methods to domains where their impact is most needed. We expect that Bitbox will accelerate integration of computational behavioral measurement into behavioral, clinical, and mental health research. Bitbox has been designed from the beginning as a community-driven effort that will evolve through contributions from both method developers and domain scientists.
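
The two-layer idea described above (backend processors that emit low-level signals, and a measurement layer that turns them into high-level metrics) can be illustrated with a minimal sketch. This is not Bitbox's actual API: every name here (`FrameSignals`, `Processor`, `DummyProcessor`, `MEASURES`, `measure`, `analyze`) is hypothetical, and the backend returns synthetic head-yaw values instead of processing a video.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


@dataclass
class FrameSignals:
    """Low-level per-frame output of a backend (hypothetical container)."""
    head_yaw_deg: List[float]


class Processor(Protocol):
    """Layer 1: anything that turns a video path into low-level signals."""
    name: str

    def run(self, video_path: str) -> FrameSignals: ...


class DummyProcessor:
    """Stand-in backend; a real wrapper would call a computer vision tool."""
    name = "dummy"

    def run(self, video_path: str) -> FrameSignals:
        # Fixed synthetic yaw track so the sketch runs without a video file.
        return FrameSignals(head_yaw_deg=[0.0, 5.0, -5.0, 10.0, 0.0])


# Layer 2: measurements are plain functions registered under a name,
# so adding a new measure does not require touching any backend.
MEASURES: Dict[str, Callable[[FrameSignals], float]] = {}


def measure(name: str):
    """Decorator that registers a measurement function (hypothetical)."""
    def register(fn):
        MEASURES[name] = fn
        return fn
    return register


@measure("head_yaw_range_deg")
def head_yaw_range(signals: FrameSignals) -> float:
    return max(signals.head_yaw_deg) - min(signals.head_yaw_deg)


@measure("head_yaw_mean_abs_step_deg")
def head_yaw_mean_abs_step(signals: FrameSignals) -> float:
    yaw = signals.head_yaw_deg
    steps = [abs(b - a) for a, b in zip(yaw, yaw[1:])]
    return sum(steps) / len(steps)


def analyze(processor: Processor, video_path: str) -> Dict[str, object]:
    """Run one backend, then every registered measure on its output."""
    signals = processor.run(video_path)
    results: Dict[str, object] = {n: fn(signals) for n, fn in MEASURES.items()}
    results["_processor"] = processor.name  # minimal provenance metadata
    return results
```

Calling `analyze(DummyProcessor(), "session.mp4")` returns a dictionary of named measures plus the name of the backend that produced them. Keeping measures independent of any particular backend is one way to let a new processor or a new metric be added without changing the other layer.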

Paper Structure

This paper contains 11 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Bitbox system architecture, consisting of two main layers: (1) a backend processor layer, including standalone computer vision tools and associated wrapper functions, and (2) a measurement layer, including functions that derive high-level behavioral measurements from the outputs of processors. Note that current backends are limited to video processing. Free graphics from Vexels are used in this illustration.
  • Figure 2: The most commonly used components of facial analysis workflows, including facial rectangles, landmarks (both 2D and 3D), pose, and expression-related movements. Most facial analysis software either provides these components or uses them internally.
  • Figure 3: Decomposition of an expression signal into multiple components at different temporal scales ($s$). For each scale, automatically detected peaks that correspond to expression events occurring at different speeds are shown as blue vertical lines.
  • Figure 4: Interactive visualizations from Bitbox for 2D facial videos (left) and corresponding 3D reconstructions (right). Visuals can be exported as PNGs for publications or presentations. For videos, faces can be blurred or fully hidden to protect personally identifiable information (PII).
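
The multi-scale view described in the Figure 3 caption (one component per temporal scale $s$, with peaks detected per component) can be sketched as follows. The caption does not say which decomposition Bitbox uses, so this sketch substitutes a plain difference-of-Gaussians band split with a simple local-maximum detector. The function names (`gaussian_smooth`, `decompose`, `local_maxima`), the scale values, and the height threshold are all assumptions made for illustration.

```python
import numpy as np


def gaussian_smooth(signal, sigma):
    """Smooth a 1D signal with a truncated Gaussian kernel (reflect padding)."""
    signal = np.asarray(signal, dtype=float)
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(signal, radius, mode="reflect")
    # "valid" trims the padding again, so the output has the input length.
    return np.convolve(padded, kernel, mode="valid")


def decompose(signal, scales):
    """Split a signal into one band per scale plus a slow residual.

    Band k holds what smoothing at scales[k] removes from the previous
    level, so fast events land in early bands and slow ones in later bands.
    The bands and the residual sum back to the original signal.
    """
    signal = np.asarray(signal, dtype=float)
    bands = []
    previous = signal
    for s in scales:  # scales are assumed to be in increasing order
        smoothed = gaussian_smooth(signal, s)
        bands.append(previous - smoothed)
        previous = smoothed
    return bands, previous


def local_maxima(x, min_height):
    """Indices of interior samples that are local maxima above min_height."""
    x = np.asarray(x, dtype=float)
    mid = x[1:-1]
    is_peak = (mid > x[:-2]) & (mid >= x[2:]) & (mid > min_height)
    return np.flatnonzero(is_peak) + 1


if __name__ == "__main__":
    # Synthetic "expression" track: one fast event and one slow event.
    t = np.arange(400)
    fast = np.exp(-0.5 * ((t - 80) / 3.0) ** 2)
    slow = np.exp(-0.5 * ((t - 280) / 30.0) ** 2)
    bands, residual = decompose(fast + slow, scales=[2, 8, 32])
    for s, band in zip([2, 8, 32], bands):
        print(f"scale {s}: peaks at frames {local_maxima(band, 0.05).tolist()}")
```

The peak threshold is the main tuning choice in this sketch: a low value marks small ripples as events, while a high value keeps only pronounced ones.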