CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision

Jaewook Lee; Andrew D. Tjahjadi; Jiho Kim; Junpu Yu; Minji Park; Jiawen Zhang; Jon E. Froehlich; Yapeng Tian; Yuhang Zhao

TL;DR

CookAR is a wearable AR system that renders real-time affordance-based augmentations (grabbable vs. hazardous areas) to help low-vision (LV) users interact with kitchen tools. The approach combines a newly created egocentric kitchen tool affordance dataset, a fine-tuned real-time segmentation model (RTMDet-Ins-l-Cook), and a stereo AR pipeline (ZED Mini + Quest 2) that overlays 3D affordance cues with near real-time latency. In the technical evaluation, the fine-tuned model outperforms the baseline (mAP $0.463$, AP@50 $0.749$, AP@75 $0.486$). A qualitative user study with LV participants (n=10) found a strong preference for affordance augmentations over whole-object overlays and elicited five new affordance ideas. The work highlights the potential of affordance-focused AR to make tool interaction safer and more efficient for LV individuals, and it provides open-source datasets and models to advance AI-powered AR for accessibility. The authors note current limitations in accuracy and latency that future hardware and model improvements should address.
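To make the AP@50 and AP@75 figures concrete, the sketch below shows how a predicted segmentation mask is typically judged against a ground-truth mask at a given intersection-over-union (IoU) threshold. It assumes the COCO-style convention in which a prediction counts as a match when its IoU meets the threshold; the function names and the toy masks are illustrative and are not taken from the CookAR code or dataset.

```python
def mask_iou(pred, truth):
    """IoU of two binary masks given as equal-size 2D lists of 0/1."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks have no overlap to measure; treat IoU as 0.
    return inter / union if union else 0.0


def is_match(pred, truth, iou_threshold):
    """True if the prediction counts as a match at this IoU threshold."""
    return mask_iou(pred, truth) >= iou_threshold


# Toy example (hypothetical data): a knife-handle mask shifted by one pixel.
truth = [[0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0]]
pred = [[0, 0, 1, 1, 1],
        [0, 1, 1, 1, 0]]

iou = mask_iou(pred, truth)            # 5 shared pixels / 7 pixels in the union
match_at_50 = is_match(pred, truth, 0.50)
match_at_75 = is_match(pred, truth, 0.75)
```

In this toy case the same prediction is a match at the looser 0.50 threshold but not at 0.75, which is why AP@75 is normally the lower of the two numbers.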

Abstract

Cooking is a central activity of daily living, supporting independence as well as mental and physical health. However, prior work has highlighted key barriers for people with low vision (LV) to cook, particularly around safely interacting with tools, such as sharp knives or hot pans. Drawing on recent advancements in computer vision (CV), we present CookAR, a head-mounted AR system with real-time object affordance augmentations to support safe and efficient interactions with kitchen tools. To design and implement CookAR, we collected and annotated the first egocentric dataset of kitchen tool affordances, fine-tuned an affordance segmentation model, and developed an AR system with a stereo camera to generate visual augmentations. To validate CookAR, we conducted a technical evaluation of our fine-tuned model as well as a qualitative lab study with 10 LV participants for suitable augmentation design. Our technical evaluation demonstrates that our model outperforms the baseline on our tool affordance dataset, while our user study indicates a preference for affordance augmentations over the traditional whole object augmentations.
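As a rough illustration of the final rendering step, the sketch below tints an RGB image with two affordance masks, using green for grabbable areas and red for hazardous areas as described in the Figure 4 caption. It is a minimal pure-Python sketch, not the CookAR implementation: the function names, the alpha value, and the rule that a hazard mask wins where the two masks overlap are all assumptions made for illustration.

```python
GRABBABLE_RGB = (0, 255, 0)   # green, per the Figure 4 caption
HAZARD_RGB = (255, 0, 0)      # red, per the Figure 4 caption


def blend(pixel, color, alpha):
    """Alpha-blend one RGB pixel toward an overlay color."""
    return tuple(round((1 - alpha) * p + alpha * c) for p, c in zip(pixel, color))


def overlay_affordances(image, grab_mask, hazard_mask, alpha=0.5):
    """Return a tinted copy of `image` (2D list of RGB tuples).

    Assumption: where both masks are set, the hazard color takes priority.
    """
    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, pixel in enumerate(row):
            if hazard_mask[y][x]:
                new_row.append(blend(pixel, HAZARD_RGB, alpha))
            elif grab_mask[y][x]:
                new_row.append(blend(pixel, GRABBABLE_RGB, alpha))
            else:
                new_row.append(pixel)  # untouched background
        out.append(new_row)
    return out


# Toy 1x3 gray image (hypothetical): handle pixel, blade pixel, background.
image = [[(100, 100, 100), (100, 100, 100), (100, 100, 100)]]
grab_mask = [[1, 0, 0]]
hazard_mask = [[0, 1, 0]]
tinted = overlay_affordances(image, grab_mask, hazard_mask, alpha=0.4)
```

A semi-transparent tint keeps the underlying tool visible, which matters when the wearer still relies on residual vision; the appropriate opacity and colors would need to be chosen per user.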

Paper Structure (33 sections, 6 figures, 2 tables)

Introduction
Related Work
Challenges in Low Vision Cooking
Using Wearable AR to Enhance Accessibility
Affordance Segmentation
System Implementation
Data Collection and Annotation
Model Fine-Tuning
The CookAR Prototype
Technical Evaluation
Methods
Results
User Study
Participants
Apparatus
...and 18 more sections

Figures (6)

Figure 1: Example Roboflow annotations for each object class in our dataset (18 classes total).
Figure 2: System overview of CookAR showing how data flows from the ZED Mini Stereo camera to an external camera for affordance segmentation, then sent back to ZED for rendering on the Quest 2 headset.
Figure 3: Example inference results on images from the test subset of our dataset. These images demonstrate how RTMDet-Ins-l-Cook identifies and segments graspable, safe areas, even in the presence of hands or other partial occlusions.
Figure 4: The CookAR prototype with whole object augmentations (left) and affordance augmentations (right). The whole object augmentations are green instance segmentation masks, while the affordance augmentations are green (grabbable) and red (hazard) affordance segmentation masks.
Figure 5: Design probes used in Part 3 of the study to spark design ideas.
...and 1 more figures
