OpenRoboCare: A Multimodal Multi-Task Expert Demonstration Dataset for Robot Caregiving

Xiaoyu Liang, Ziang Liu, Kelvin Lin, Edward Gu, Ruolin Ye, Tam Nguyen, Cynthia Hsu, Zhanxin Wu, Xiaoman Yang, Christy Sum Yu Cheung, Harold Soh, Katherine Dimitropoulou, Tapomayukh Bhattacharjee

TL;DR

OpenRoboCare is a large, expert-collected multimodal dataset for robot caregiving. It aggregates 315 sessions in which 21 occupational therapists (OTs) perform 15 Activities of Daily Living (ADLs) on two manikins, yielding 19.8 hours and 31,185 samples across five modalities. By capturing demonstrations that involve perception, safe physical interaction, and long-horizon planning, the work distills guiding OT principles and exposes gaps in current perception and planning methods, establishing the dataset as a challenging benchmark. The findings show that fine-tuning on a small subset can substantially improve performance, underscoring the dataset’s practical value for developing robust, adaptive robot caregivers. Overall, OpenRoboCare aims to accelerate multimodal learning and safe automation in real-world caregiving through rich data and actionable guidelines.

Abstract

We present OpenRoboCare, a multimodal dataset for robot caregiving, capturing expert occupational therapist demonstrations of Activities of Daily Living (ADLs). Caregiving tasks involve complex physical human-robot interactions, requiring precise perception under occlusions, safe physical contact, and long-horizon planning. While recent advances in robot learning from demonstrations have shown promise, there is a lack of a large-scale, diverse, and expert-driven dataset that captures real-world caregiving routines. To address this gap, we collect data from 21 occupational therapists performing 15 ADL tasks on two manikins. The dataset spans five modalities: RGB-D video, pose tracking, eye-gaze tracking, task and action annotations, and tactile sensing, providing rich multimodal insights into caregiver movement, attention, force application, and task execution strategies. We further analyze expert caregiving principles and strategies, offering insights to improve robot efficiency and task feasibility. Additionally, our evaluations demonstrate that OpenRoboCare presents challenges for state-of-the-art robot perception and human activity recognition methods, both critical for developing safe and adaptive assistive robots, highlighting the value of our contribution. See our website for additional visualizations: https://emprise.cs.cornell.edu/robo-care/.

Paper Structure

This paper contains 22 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of OpenRoboCare dataset for robot caregiving, featuring 21 occupational therapists demonstrating 15 common caregiving tasks, captured across 5 data modalities. It consists of 315 sessions, totaling 19.8 hours, with a collection of 31,185 samples.
  • Figure 2: Data collection setup and procedure. Left: setup of sensors and equipment. Center: assistive devices used by caregivers. Right: sequence of tasks performed by each caregiver.
  • Figure 3: Tactile skin design and layout of sensors on manikin.
  • Figure 4: Analysis of Dataset Characteristics. We analyze the diversity of the dataset across different aspects: (a-c) general data collection statistics; (d-f) occupational therapists' strategies; (g) time duration across tasks; (h-j) physical contact characteristics; and (k-l) force magnitude information.