Memoro: Using Large Language Models to Realize a Concise Interface for Real-Time Memory Augmentation

Wazeer Zulfikar, Samantha Chan, Pattie Maes

TL;DR

In a study with 20 participants, Memoro reduced device interaction time and increased recall confidence while preserving conversational quality, a step towards using LLMs to design wearable memory augmentation systems that are minimally disruptive.

Abstract

People have to remember an ever-expanding volume of information. Wearables that use information capture and retrieval for memory augmentation can help, but they can be disruptive and cumbersome in real-world tasks, such as in social settings. To address this, we developed Memoro, a wearable audio-based memory assistant with a concise user interface. Memoro uses a large language model (LLM) to infer the user's memory needs in a conversational context, semantically search memories, and present minimal suggestions. The assistant has two interaction modes: Query Mode for voicing queries and Queryless Mode for on-demand predictive assistance without an explicit query. Our study with N=20 participants engaged in a real-time conversation demonstrated that using Memoro reduced device interaction time and increased recall confidence while preserving conversational quality. We report quantitative results and discuss the preferences and experiences of users. This work contributes towards utilizing LLMs to design wearable memory augmentation systems that are minimally disruptive.

Paper Structure (62 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: The closed-loop system architecture has a memory encoder that is continuously updated using speech-to-text. The system can be configured to use Query Mode or Queryless Mode. In Query Mode, the explicit query is voiced by the user, while in Queryless Mode, the query agent infers the query. The query and memories are input to the retrieval agent, which returns a concise memory suggestion that is delivered to the user through bone conduction.
  • Figure 2: Detailed workflow of the components of Memoro: memory encoder, retrieval agent, and query agent. The memory encoder takes speech transcriptions and maintains the context and external memories. The query agent takes in the context and produces an inferred query. The retrieval agent takes a query and retrieves an answer from external memories.
  • Figure 3: Procedure for the user study for each participant
  • Figure 4: Example interactions by P3 and P19 show the Query Mode and the Queryless Mode, respectively, for the same memory. The timestamps are changed for reporting.
  • Figure 5: Task performance and task load: (a) Confidence in recalling, (b) Relevance of recalled information, (c) Perceived difficulty in recalling, and (d) Raw NASA TLX scores. ***: $p < .001$
  • ...and 3 more figures
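
The workflow described in the captions for Figures 1 and 2 (memory encoder, query agent, retrieval agent) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `MemoryEncoder`, `query_agent`, `retrieval_agent`, and `suggest` are hypothetical, the LLM-based query inference is replaced by a stub that reuses the latest utterance, and semantic search is approximated by bag-of-words cosine similarity. The rule that utterances leaving the context window become external memories is also an assumption made to keep the sketch self-contained.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


def _bag_of_words(text: str) -> Counter:
    # Lowercased token counts; a crude stand-in for a semantic embedding.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class MemoryEncoder:
    """Keeps a short conversational context plus older external memories."""
    context_size: int = 3
    context: List[str] = field(default_factory=list)
    memories: List[str] = field(default_factory=list)

    def add_transcript(self, utterance: str) -> None:
        # Assumption: utterances that fall out of the context window
        # are moved into the external memory store.
        self.context.append(utterance)
        while len(self.context) > self.context_size:
            self.memories.append(self.context.pop(0))


def query_agent(context: List[str]) -> str:
    # Stub for the LLM that infers a query from context (Queryless Mode).
    # Here the most recent utterance is simply reused as the query.
    return context[-1] if context else ""


def retrieval_agent(query: str, memories: List[str]) -> Optional[str]:
    # Returns the single best-matching memory, or None if nothing overlaps.
    if not query or not memories:
        return None
    q = _bag_of_words(query)
    scored = [(_cosine(q, _bag_of_words(m)), m) for m in memories]
    best_score, best_memory = max(scored, key=lambda pair: pair[0])
    return best_memory if best_score > 0 else None


def suggest(encoder: MemoryEncoder, query: Optional[str] = None) -> Optional[str]:
    # Query Mode: the caller supplies an explicit query.
    # Queryless Mode: the query is inferred from the current context.
    if query is None:
        query = query_agent(encoder.context)
    return retrieval_agent(query, encoder.memories)
```

Calling `suggest(encoder, query="...")` corresponds to Query Mode, and calling `suggest(encoder)` with no query corresponds to Queryless Mode. In a real system the two stubs would be replaced by LLM calls and an embedding-based search, and the returned suggestion would be shortened before being spoken to the user.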