CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision
Jaewook Lee, Andrew D. Tjahjadi, Jiho Kim, Junpu Yu, Minji Park, Jiawen Zhang, Jon E. Froehlich, Yapeng Tian, Yuhang Zhao
TL;DR
CookAR is a wearable AR system that renders real-time affordance-based augmentations (grabbable vs. hazardous areas) to help low-vision (LV) users interact with kitchen tools. The approach combines a newly created egocentric kitchen tool affordance dataset, a fine-tuned real-time segmentation model (RTMDet-Ins-l-Cook), and a stereo AR pipeline (ZED Mini + Quest 2) that overlays 3D affordance cues with near-real-time latency. In the technical evaluation, the fine-tuned model outperforms the baseline (mAP $0.463$, AP@50 $0.749$, AP@75 $0.486$). A qualitative user study with LV participants (n=10) reveals a strong preference for affordance augmentations over whole-object overlays and elicits five new affordance ideas. The work highlights the potential of affordance-focused AR to make tool interaction safer and more efficient for LV individuals, and it provides open-source datasets and models to advance AI-powered AR for accessibility. It also notes current limitations in accuracy and latency that future hardware and model improvements should address.
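To make the AP@50 and AP@75 figures above easier to read, the sketch below shows how a predicted segmentation mask could be matched to a ground-truth mask by intersection over union (IoU). It assumes the metrics follow the common COCO-style convention, in which a prediction counts as a match when its IoU meets the threshold (0.5 or 0.75); the summary does not state the exact evaluation protocol, and `mask_iou` and the toy masks are hypothetical names and data used only for illustration.

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union of two binary masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(pred, gt).sum() / union)

# Toy example (made-up data): a 6x6 ground-truth region and a prediction
# shifted one column to the right.
gt = np.zeros((10, 10), dtype=bool)
gt[2:8, 2:8] = True
pred = np.zeros((10, 10), dtype=bool)
pred[2:8, 3:9] = True

iou = mask_iou(pred, gt)
matches_at_50 = iou >= 0.50  # would count as a match for AP@50
matches_at_75 = iou >= 0.75  # would count as a match for AP@75
```

Under this convention, a prediction can count toward AP@50 while failing the stricter AP@75 criterion, which is consistent with the AP@75 value being lower than the AP@50 value.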
Abstract
Cooking is a central activity of daily living, supporting independence as well as mental and physical health. However, prior work has highlighted key barriers to cooking for people with low vision (LV), particularly around safely interacting with tools such as sharp knives or hot pans. Drawing on recent advancements in computer vision (CV), we present CookAR, a head-mounted AR system with real-time object affordance augmentations to support safe and efficient interactions with kitchen tools. To design and implement CookAR, we collected and annotated the first egocentric dataset of kitchen tool affordances, fine-tuned an affordance segmentation model, and developed an AR system with a stereo camera to generate visual augmentations. To validate CookAR, we conducted a technical evaluation of our fine-tuned model as well as a qualitative lab study with 10 LV participants to explore suitable augmentation designs. Our technical evaluation demonstrates that our model outperforms the baseline on our tool affordance dataset, while our user study indicates a preference for affordance augmentations over traditional whole-object augmentations.
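As an illustration of what generating visual augmentations from affordance masks might look like, the following minimal sketch alpha-blends a tint over each affordance region of a camera frame. It is not the authors' implementation: the colour scheme, the `overlay_affordances` function, the label names, and the blending weight are all assumptions made for this example, and the real system renders augmentations in 3D through a stereo pipeline rather than on a single 2D frame.

```python
import numpy as np

# Hypothetical colour scheme (RGB); the colours CookAR actually uses are not given here.
AFFORDANCE_COLORS = {
    "grabbable": (0, 255, 0),
    "hazardous": (255, 0, 0),
}

def overlay_affordances(frame: np.ndarray, masks: dict, alpha: float = 0.5) -> np.ndarray:
    """Blend a tint over each affordance mask.

    frame: H x W x 3 uint8 image.
    masks: maps an affordance label to an H x W boolean mask.
    alpha: tint opacity in [0, 1].
    """
    out = frame.astype(np.float32)
    for label, mask in masks.items():
        color = np.array(AFFORDANCE_COLORS[label], dtype=np.float32)
        region = mask.astype(bool)
        # Blend only the pixels inside this affordance region.
        out[region] = (1.0 - alpha) * out[region] + alpha * color
    return np.clip(out.round(), 0, 255).astype(np.uint8)

# Toy frame and masks (made-up data).
frame = np.full((4, 4, 3), 100, dtype=np.uint8)
grab = np.zeros((4, 4), dtype=bool)
grab[0, 0] = True
hazard = np.zeros((4, 4), dtype=bool)
hazard[3, 3] = True

augmented = overlay_affordances(frame, {"grabbable": grab, "hazardous": hazard}, alpha=1.0)
```

Keeping the masks separate per affordance label, rather than merging them into one whole-object mask, is what allows the handle and the blade of a knife, for example, to be tinted differently.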
