Robustness Testing of Multi-Modal Models in Varied Home Environments for Assistive Robots

Lea Hirlimann; Shengqiang Zhang; Hinrich Schütze; Philipp Wicke

TL;DR

This work examines how robust multi-modal robotic models are in home environments by introducing disturbances in the AI2Thor simulator that emulate real-world variability relevant to geriatric care. It evaluates three open-source models that perform strongly on ALFRED (HLSM, FILM, EmBERT) on disturbed tasks derived from ALFRED, including dimly lit rooms, glass doors, and mirror reflections. Preliminary results show that disturbances generally reduce Task Success and Goal Condition Success, while depth sensing offers some resilience for certain models (e.g., HLSM with depth data shows improved metrics in glass-wall scenarios). The study provides a methodology and initial findings to guide robust model development and collaboration with geriatrics practitioners, with the aim of improving the reliability and safety of assistive robots in real homes.
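To make the two metrics concrete, here is a minimal sketch of how Task Success and Goal Condition Success could be aggregated per disturbance. It assumes the usual ALFRED-style definitions (an episode succeeds only if all of its goal conditions are met; goal-condition success is the fraction of conditions met, averaged over episodes). The `Episode` class, the `summarize` helper, and the condition labels are hypothetical names introduced for illustration, and the episode values are made-up toy numbers, not results from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Episode:
    """One evaluation episode (hypothetical record layout)."""
    condition: str    # e.g. "baseline" or "glass_door" (illustrative labels)
    goals_met: int    # number of goal conditions satisfied at episode end
    goals_total: int  # number of goal conditions the task defines (> 0)


def summarize(episodes):
    """Return {condition: (task_success_rate, goal_condition_rate)}."""
    grouped = defaultdict(list)
    for ep in episodes:
        grouped[ep.condition].append(ep)

    summary = {}
    for condition, eps in grouped.items():
        # Task Success: fraction of episodes in which every goal condition is met.
        task_success = sum(ep.goals_met == ep.goals_total for ep in eps) / len(eps)
        # Goal Condition Success: mean fraction of goal conditions met per episode.
        goal_condition = sum(ep.goals_met / ep.goals_total for ep in eps) / len(eps)
        summary[condition] = (task_success, goal_condition)
    return summary


# Toy episodes for illustration only; these are not measurements from the paper.
toy = [
    Episode("baseline", 2, 2),
    Episode("baseline", 1, 2),
    Episode("glass_door", 0, 4),
]
for condition, (ts, gc) in summarize(toy).items():
    print(f"{condition}: task success {ts:.2f}, goal-condition success {gc:.2f}")
```

Grouping by condition keeps the undisturbed and disturbed runs side by side, which is the comparison the evaluation relies on.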

Abstract

The development of assistive robotic agents to support household tasks is advancing, yet the underlying models often operate in virtual settings that do not reflect real-world complexity. For assistive care robots to be effective in diverse environments, their models must be robust and integrate multiple modalities. Consider a caretaker who needs assistance in a dimly lit room, or a robot that must navigate around a newly installed glass door. Models relying solely on visual input might fail in low light, while those using depth information could avoid the door. This illustrates the need for models that can process multiple sensory inputs. Our ongoing study evaluates state-of-the-art robotic models in the AI2Thor virtual environment. We introduce disturbances, such as dimmed lighting and mirrored walls, to assess their impact on modalities such as movement and vision, and on object recognition. Our goal is to gather input from the Geriatronics community to understand and model the challenges faced by practitioners.

Paper Structure (14 sections, 2 figures, 1 table)

Introduction
Related Works
Simulation Environments
Modalities for Model Training
Selected Models
Methodology
Model Selection
Disturbances
Tasks Selection
Evaluation Metric
Current Progress
Preliminary Results
Next Steps
Conclusion

Figures (2)

Figure 1: Several state-of-the-art models for assistive robots are evaluated in the AI2Thor environment for their robustness against challenging disturbances, such as dimmed lighting or encountering a glass door (see the above image).
Figure 2: Semantic map produced by FILM during Task #4, where the agent (red marker) registers the glass wall as an obstacle (grey vertical line at top center) and navigates away from it (red dots) without getting stuck.
