Toward Conversational Agents with Context and Time Sensitive Long-term Memory

Nick Alonso, Tomás Figliolia, Anthony Ndirango, Beren Millidge

TL;DR

This paper tackles the challenge of enabling conversational agents to recall information based on conversational meta-data and ambiguous references, a scenario poorly served by existing RAG benchmarks. It introduces a modified version of the LoCoMo dataset with time-stamped, long-form dialogues and diverse time-based and ambiguous queries, paired with a retrieval model that combines chain-of-table meta-data querying and semantic vector search, plus query rewriting to resolve ambiguities. Empirically, the CoTable+Semantic approach with a meta-semantic classifier and disambiguation steps significantly outperforms pure semantic baselines on time-based and time+content tasks, with consistent gains across two LLMs. The work provides a concrete benchmark and a scalable retrieval architecture that advances memory-augmented conversational agents toward practical, time-aware memory capabilities.

Abstract

There has recently been growing interest in conversational agents with long-term memory, which has led to the rapid development of language models that use retrieval-augmented generation (RAG). Until recently, most work on RAG has focused on information retrieval from large databases of texts, like Wikipedia, rather than information from long-form conversations. In this paper, we argue that effective retrieval from long-form conversational data faces two unique problems compared to static database retrieval: 1) time/event-based queries, which require the model to retrieve information about previous conversations based on time or the order of a conversational event (e.g., the third conversation on Tuesday), and 2) ambiguous queries that require surrounding conversational context to understand. To better develop RAG-based agents that can deal with these challenges, we generate a new dataset of ambiguous and time-based questions that builds upon a recent dataset of long-form, simulated conversations, and demonstrate that standard RAG-based approaches handle such questions poorly. We then develop a novel retrieval model which combines chain-of-table search methods, standard vector-database retrieval, and a prompting method to disambiguate queries, and demonstrate that this approach substantially improves over current methods at solving these tasks. We believe that this new dataset and more advanced RAG agent can act as a key benchmark and stepping stone towards effective memory-augmented conversational agents that can be used in a wide variety of AI applications.
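
To make the abstract's example of a time/event-based query concrete ("the third conversation on Tuesday"), the following is a minimal sketch of how such a query can be answered by filtering and ordering session meta-data rather than by semantic similarity. It is not the authors' implementation: the `sessions` records, their field names, and the helper `nth_session_on_weekday` are all hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical conversation log (invented for illustration): each session
# carries a start timestamp as meta-data alongside its text.
sessions = [
    {"id": 0, "start": datetime(2023, 5, 1, 9, 0), "text": "Hiking plans."},       # Monday
    {"id": 1, "start": datetime(2023, 5, 2, 8, 30), "text": "A new job offer."},   # Tuesday
    {"id": 2, "start": datetime(2023, 5, 2, 13, 0), "text": "A pasta recipe."},    # Tuesday
    {"id": 3, "start": datetime(2023, 5, 2, 19, 15), "text": "A birthday party."}, # Tuesday
    {"id": 4, "start": datetime(2023, 5, 3, 10, 0), "text": "Hiking trip again."}, # Wednesday
]


def nth_session_on_weekday(sessions, weekday, n):
    """Return the n-th (1-indexed) session, in time order, that started on
    the given weekday (Monday=0), or None if there are fewer than n."""
    matches = sorted(
        (s for s in sessions if s["start"].weekday() == weekday),
        key=lambda s: s["start"],
    )
    return matches[n - 1] if 0 < n <= len(matches) else None


# "The third conversation on Tuesday" resolved purely from meta-data.
third_on_tuesday = nth_session_on_weekday(sessions, weekday=1, n=3)
```

Nothing in the text of the selected session mentions "Tuesday" or "third", which illustrates why a query of this kind is hard to serve with embedding similarity alone.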

Paper Structure (23 sections, 3 equations, 4 figures, 5 tables, 1 algorithm)

Figures (4)

  • Figure 1: Depiction of our combined tabular and semantic vector-search method.
  • Figure 2: Examples of queries for various types in our dataset.
  • Figure 3: F2 scores for each individual time-based test and the time+content based test. All models use k=10 for semantic search. Error bars show std. of recall and precision across data in each test.
  • Figure 4: Classification accuracy for the meta-semantic classifier on each test set.
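
The Figure 3 caption reports F2 scores. Assuming this refers to the standard F-beta measure with beta = 2 (which weights recall more heavily than precision), the score can be computed from precision and recall as sketched below. The function name `f_beta` and the example values are illustrative and are not taken from the paper.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R).

    With beta=2 this is the F2 score, which favors recall over precision.
    Returns 0.0 when both precision and recall are zero.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only: swapping precision and recall changes F2,
# because recall counts for more.
high_recall = f_beta(precision=0.5, recall=1.0)
high_precision = f_beta(precision=1.0, recall=0.5)
```

Under this definition, a retriever with perfect recall and moderate precision scores higher than one with perfect precision and moderate recall, which suits retrieval settings where missing a relevant memory is costlier than returning an extra one.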