DIJIT: A Robotic Head for an Active Observer

Mostafa Kamali Tabrizi, Mingshi Chi, Bir Bikram Dey, Yu Qing Yuan, Markus D. Solbach, Yiqian Liu, Michael Jenkin, John K. Tsotsos

TL;DR

DIJIT introduces a fully binocular, human-inspired robotic head with nine mechanical degrees of freedom and four optical degrees of freedom per camera to enable active vision research and cross-domain comparisons with human vision. A data-driven saccade control approach uses calibration-based homographies to map fixation points to motor commands, avoiding complex kinematic modeling. Experimental results demonstrate human-like saccade accuracy and high peak speeds, with open-source hardware and software facilitating replication and further study. The work advances active binocular perception and lays groundwork for binocular 3D reconstruction and assistive robotics tasks.

Abstract

We present DIJIT, a novel binocular robotic head expressly designed for mobile agents that behave as active observers. DIJIT's unique breadth of functionality enables active vision research and the study of human-like eye and head-neck motions, their interrelationships, and how each contributes to visual ability. DIJIT is also being used to explore the differences between how human vision employs eye/head movements to solve visual tasks and current computer vision methods. DIJIT's design features nine mechanical degrees of freedom, while the cameras and lenses provide an additional four optical degrees of freedom. The ranges and speeds of the mechanical design are comparable to human performance. Our design includes the ranges of motion required for convergent stereo, namely, vergence, version, and cyclotorsion. The exploration of the utility of these to both human and machine vision is ongoing. Here, we present the design of DIJIT and evaluate aspects of its performance. We present a new method for saccadic camera movements. In this method, a direct relationship between camera orientation and motor values is developed. The resulting saccadic camera movements are close to human movements in terms of their accuracy.
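
The TL;DR describes the saccade controller as a calibration-based homography that maps fixation points to motor commands. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes that the fixation target is given as a pixel offset from the image center, that the motor command is a hypothetical (pan, tilt) pair, and that calibration yields matched pixel/motor samples. The function names, the direct linear transform fit, and all numeric values are illustrative assumptions; the calibration data here is synthetic.

```python
import numpy as np


def fit_homography(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src (direct linear transform).

    src, dst: sequences of matched 2D points (at least four pairs).
    """
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # fix the arbitrary scale


def apply_homography(h, point):
    """Map a 2D point through H and dehomogenize."""
    u, v, w = h @ np.array([point[0], point[1], 1.0])
    return np.array([u / w, v / w])


# Synthetic "ground truth" mapping from pixel offsets to hypothetical
# (pan, tilt) motor values. These numbers are made up for illustration.
h_true = np.array([
    [0.05, 0.001, 0.2],
    [-0.002, 0.048, -0.1],
    [1e-5, 2e-5, 1.0],
])

# Simulated calibration: a grid of pixel offsets and the motor values
# that (in this toy model) would center each of them.
pixels = [(x, y) for x in np.linspace(-300, 300, 4)
          for y in np.linspace(-200, 200, 3)]
motors = np.array([apply_homography(h_true, p) for p in pixels])

# Fit the mapping from the calibration samples, then query a new target.
h_fit = fit_homography(pixels, motors)
target_offset = (120.0, -45.0)
command = apply_homography(h_fit, target_offset)
print("hypothetical (pan, tilt) command:", command)
```

With real calibration samples the data would be noisy, so a practical fit would likely use more points and normalize coordinates before the SVD. The appeal of this kind of mapping is that a lookup from image position to motor values needs no explicit kinematic model of the linkages.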

Paper Structure

This paper contains 15 sections, 2 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: The DIJIT head from four different viewpoints. The upper part of the head, excluding the neck, is 22 cm $\times$ 18 cm $\times$ 12 cm (W $\times$ D $\times$ H), and the whole head, including the neck, is 22 cm $\times$ 22 cm $\times$ 26 cm.
  • Figure 2: The degrees of freedom of the neck (a) and cameras (b).
  • Figure 3: (a) The degrees of freedom of the right camera. (b) Red boundaries highlight the motor and linkages responsible for the primary tilt motion, green boundaries highlight those responsible for the primary pan motion, and blue boundaries highlight those responsible for the roll motion.
  • Figure 4: Views from DIJIT: The red point indicates the image center in both images. (a) Before saccade execution, the green point indicates the target point for fixation. (b) After a precise saccade execution, the image is centered at the intended fixation point.
  • Figure 5: Landing points after saccade executions (191 fixations). (a) An image of similar size to DIJIT images; the green circle is centered at the image center and has a radius of $1^\circ$. (b) A zoomed-in portion of the center of the image in (a).
  • ...and 1 more figure