Designing Interfaces for Multimodal Vector Search Applications

Owen Pendrigh Elliott, Tom Hamer, Jesse Clark

TL;DR

This paper explores novel capabilities of multimodal vector search applications utilising CLIP models, and presents implementations and design patterns that help users express their information needs and interact effectively with these systems in an information retrieval context.

Abstract

Multimodal vector search offers a new paradigm for information retrieval by exposing functionality that is not possible in traditional lexical search engines. While multimodal vector search can be treated as a drop-in replacement for these traditional systems, the experience can be significantly enhanced by leveraging the unique capabilities of multimodal search. Central to any information retrieval system is a user who expresses an information need. Traditional user interfaces with a single search bar allow users to interact with lexical search systems effectively; however, they are not necessarily optimal for multimodal vector search. In this paper we explore novel capabilities of multimodal vector search applications utilising CLIP models and present implementations and design patterns that help users express their information needs and interact effectively with these systems in an information retrieval context.

Paper Structure (14 sections, 4 equations, 9 figures, 3 algorithms)


Figures (9)

  • Figure 1: Multiple search fields for query refinement.
  • Figure 2: Iterative refinement of search results with multi-part queries. Data presented here is from an online furniture retailer.
  • Figure 3: Query refinement to remove low quality items from search results.
  • Figure 4: Query prompting with predefined prompts. In this example we use "A black and white, monochromatic image of a <QUERY>".
  • Figure 5: Online query expansion via semantic filtering with LLM generated expansion terms from user preferences.
  • ...and 4 more figures
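
The query prompting pattern named in the Figure 4 caption can be illustrated with a minimal sketch. Only the template string is taken from the caption; the helper name `apply_prompt`, the constant names, and the example query are hypothetical and are not drawn from the paper's implementation. The sketch covers just the text substitution step, on the assumption that the prompted string would then be passed to a CLIP text encoder in place of the raw query.

```python
# Minimal sketch of query prompting with a predefined prompt.
# Assumption: all names below are illustrative; only the template text
# comes from the Figure 4 caption.

PLACEHOLDER = "<QUERY>"
MONOCHROME_TEMPLATE = "A black and white, monochromatic image of a <QUERY>"


def apply_prompt(template: str, user_query: str) -> str:
    """Insert the user's query into a predefined prompt template."""
    if PLACEHOLDER not in template:
        raise ValueError(f"template must contain the placeholder {PLACEHOLDER}")
    # Strip surrounding whitespace so the prompt reads as a natural phrase.
    return template.replace(PLACEHOLDER, user_query.strip())


if __name__ == "__main__":
    # "sofa" is an arbitrary example query.
    print(apply_prompt(MONOCHROME_TEMPLATE, "sofa"))
```

In an interface, such templates could be exposed as selectable style options, so the user types only the subject of the search while the application supplies the stylistic wording.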