GenQuery: Supporting Expressive Visual Search with Generative Models

Kihoon Son, DaEun Choi, Tae Soo Kim, Young-Ho Kim, Juho Kim

TL;DR

This work proposes GenQuery, a novel system that integrates generative models into the visual search process and enables users to generatively modify images and use these in similarity-based search.

Abstract

Designers rely on visual search to explore and develop ideas in early design stages. However, designers can struggle to identify suitable text queries to initiate a search or to discover images for similarity-based search that can adequately express their intent. We propose GenQuery, a novel system that integrates generative models into the visual search process. GenQuery can automatically elaborate on users' queries and surface concrete search directions when users only have abstract ideas. To support precise expression of search intents, the system enables users to generatively modify images and use these in similarity-based search. In a comparative user study (N=16), designers felt that they could more accurately express their intents and find more satisfactory outcomes with GenQuery compared to a tool without generative features. Furthermore, the unpredictability of generations allowed participants to uncover more diverse outcomes. By supporting both convergence and divergence, GenQuery led to a more creative experience.

Paper Structure

This paper contains 42 sections, 11 figures, and 1 table.

Figures (11)

  • Figure 1: The overall visual search process in an exemplar-based search tool: Designers initiate their visual search with a text-based search to find an image that they want to explore. When they find such an image, they perform an image-based search by clicking it. After seeing enough results similar to the initial input image (in the middle of an image-based search), they try to explore more diverse images by clicking images that differ from the ones seen so far. When the designers feel that they keep finding similar images, they return to the text-based search to explore other images.
  • Figure 2: Interface of GenQuery. GenQuery shows the image search results in a gallery. (a) Text prompt input box for text-based search: The user can enter a text description of the desired image here; (b) Clickable image for image-based search: Clicking an image in the gallery triggers an image-based search, and GenQuery shows images similar to the clicked one at the bottom of the gallery; (c) Like button: The user can click the like button to save the design to the side panel; (d) Generation button: To edit one of the retrieved images and generate a new input, the user can click the marble emoji at the top left of the image card. Clicking this button opens the generation panel below; (e) Show more button: The user clicks this button to see more search results.
  • Figure 3: Query concretization, which allows the user to concretize their initial abstract search query through zero-shot LLM prompting: (a) Suggested search query. The user can swap their query for the suggestion by pressing the Tab key; (b) Images retrieved by the suggested query are shown below in the search bar. Each image is clickable to run an image-based search; (c) GenQuery provides five suggestions at a time, and the user can explore other suggestions by pressing the up and down arrow keys.
  • Figure 4: Image-based image modification, which allows the user to express a clear search intent through reference-based editing and search: (a) An image the user wants to modify. In this panel, the user can click or drag over the areas of the image that they want to modify. Here, the blue ice mountain illustration has been selected; (b) An image the user wants to refer to during editing. The user can select a reference image from the search results or the saved design list; (c) The generation output. The user can regenerate the output or run an image-based search with it. (a1) and (c1) show the difference between the search results for the original image (a) and the generated image (c).
  • Figure 5: Keyword-based image modification, which allows the user to diversify their search intent through keyword-based editing and search: (a) An image the user wants to modify. Here, the green forest illustration has been selected; (b) Keyword suggestion panel based on the search history (e.g., search queries and saved image descriptions). GenQuery suggests keywords that are similar to the previous search history (b1) and keywords that are new relative to it (b2). The user can also enter their own keyword in (b3); (c) The generation output. The user can regenerate the output or run an image-based search with it. (a1) and (c1) show the difference between the search results for the original image (a) and the generated image (c).
  • ...and 6 more figures