Cognitively-Inspired Emergent Communication via Knowledge Graphs for Assisting the Visually Impaired
Ruxiao Chen, Dezheng Han, Wenjie Han, Shuaishuai Guo
TL;DR
This work tackles the need for fast yet semantically rich guidance for visually impaired users in dynamic environments by introducing VAG-EC, a cognitively inspired emergent communication framework that grounds messages in knowledge graphs. Scenes are converted into object-centric graphs via SAM segmentation and proximity-based edges, with Graph Convolutional Networks and attention producing a structured representation that informs compact symbolic messages. Messages are learned through a Lewis-style referential game using a differentiable Gumbel-Softmax relaxation, enabling end-to-end optimization. Across varying vocabulary sizes, VAG-EC achieves higher Context Independence (CI) and Topographic Similarity (TopSim) than baselines, and exhibits more balanced token usage, reflecting stronger semantic grounding and interpretability. The approach demonstrates potential for real-time, human-aligned assistive modalities, though broader-domain and human-in-the-loop evaluations remain as future directions.
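The Gumbel-Softmax relaxation mentioned above can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the logits, the temperatures, and the helper names `sample_gumbel` and `gumbel_softmax` are assumptions chosen for illustration, and a real training loop would use an automatic-differentiation framework so that gradients flow through the relaxed sample.

```python
import math
import random


def sample_gumbel(n, rng):
    """Draw n independent Gumbel(0, 1) samples via g = -log(-log(u))."""
    samples = []
    for _ in range(n):
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        samples.append(-math.log(-math.log(u)))
    return samples


def gumbel_softmax(logits, noise, temperature):
    """Relaxed one-hot vector: softmax((logits + noise) / temperature)."""
    scaled = [(l + g) / temperature for l, g in zip(logits, noise)]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical speaker scores over a 4-symbol vocabulary (illustrative values).
rng = random.Random(0)
logits = [2.0, 0.5, -1.0, 0.0]
noise = sample_gumbel(len(logits), rng)

soft = gumbel_softmax(logits, noise, temperature=1.0)   # smooth distribution
sharp = gumbel_softmax(logits, noise, temperature=0.1)  # closer to one-hot

# The discrete symbol that would be sent is the argmax of the relaxed vector.
symbol = max(range(len(sharp)), key=lambda i: sharp[i])
```

With the noise held fixed, lowering the temperature concentrates probability mass on the same symbol, which is why the relaxation can stand in for a discrete choice while remaining differentiable with respect to the logits.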
Abstract
Assistive systems for visually impaired individuals must deliver rapid, interpretable, and adaptive feedback to facilitate real-time navigation. Current approaches face a trade-off between latency and semantic richness: natural language-based systems provide detailed guidance but are too slow for dynamic scenarios, while emergent communication frameworks offer low-latency symbolic languages but lack semantic depth, limiting their utility in tactile modalities like vibration. To address these limitations, we introduce a novel framework, Cognitively-Inspired Emergent Communication via Knowledge Graphs (VAG-EC), which emulates human visual perception and cognitive mapping. Our method constructs knowledge graphs to represent objects and their relationships, incorporating attention mechanisms to prioritize task-relevant entities, thereby mirroring human selective attention. This structured approach enables the emergence of compact, interpretable, and context-sensitive symbolic languages. Extensive experiments across varying vocabulary sizes and message lengths demonstrate that VAG-EC outperforms traditional emergent communication methods in Topographic Similarity (TopSim) and Context Independence (CI). These findings underscore the potential of cognitively grounded emergent communication as a fast, adaptive, and human-aligned solution for real-time assistive technologies. Code is available at https://github.com/Anonymous-NLPcode/Anonymous_submission/tree/main.
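For readers unfamiliar with Topographic Similarity, the sketch below shows one common way the metric is computed: as the Spearman correlation between pairwise distances in meaning space and pairwise distances in message space. It is an illustration under stated assumptions, not the evaluation code of this work: Hamming distance is used on both sides (which presumes fixed-length messages), and the toy meanings and messages are invented for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def topsim(meanings, messages):
    """Spearman correlation between meaning and message pairwise distances."""
    pairs = list(combinations(range(len(meanings)), 2))
    d_meaning = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    d_message = [hamming(messages[i], messages[j]) for i, j in pairs]
    return pearson(average_ranks(d_meaning), average_ranks(d_message))


# Toy data (assumed): two binary attributes per meaning.
meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
compositional = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
scrambled = [("a", "x"), ("b", "y"), ("a", "y"), ("b", "x")]

score_compositional = topsim(meanings, compositional)
score_scrambled = topsim(meanings, scrambled)
```

In this toy setting the compositional code, where each symbol position tracks one attribute, scores higher than the scrambled code, which is the behavior the metric is meant to reward.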
