CLIPSE -- a minimalistic CLIP-based image search engine for research

Steve Göring

TL;DR

CLIPSE addresses the need for a lightweight, self-hosted image search engine suitable for research on small datasets using OpenCLIP embeddings. It maps both images and text queries into a shared embedding space and computes similarity via a dot product, offering a CLI and a lightweight web interface. The implementation is CPU-only, built in Python, with index construction and querying accessible via build_index.py, query.py, and server.py. Benchmark results indicate near-linear indexing times and tens-of-milliseconds query latencies under warm conditions, supporting CLIPSE's practicality for small datasets and suggesting a path toward distributed deployment for larger scales.

Abstract

This paper gives a brief overview of CLIPSE, a self-hosted image search engine intended primarily for research use. CLIPSE uses CLIP embeddings to process both the images and the text queries. The framework is deliberately simple so that it is easy to use and extend. Two benchmark scenarios, covering indexing time and query time, are described and evaluated. The results show that CLIPSE can handle smaller datasets; for larger datasets, a distributed approach with several instances should be considered.
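
The retrieval step summarized above (images and text mapped into one embedding space, then ranked by dot product) can be illustrated with a minimal sketch. This is not the CLIPSE implementation: the function names `normalize`, `dot`, and `search`, the in-memory list used as an index, and the toy 3-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from an OpenCLIP image or text encoder.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def search(index, query_embedding, top_k=3):
    """Rank indexed images by dot-product similarity to the query embedding.

    `index` is a list of (image_path, unit_length_embedding) pairs
    (a hypothetical layout, chosen here only for clarity).
    """
    q = normalize(query_embedding)
    scored = [(dot(q, emb), path) for path, emb in index]
    scored.sort(reverse=True)  # highest similarity first
    return [(path, score) for score, path in scored[:top_k]]


# Toy vectors stand in for real image embeddings (assumption).
toy_index = [
    ("cat.jpg", normalize([0.9, 0.1, 0.0])),
    ("dog.jpg", normalize([0.1, 0.9, 0.0])),
    ("car.jpg", normalize([0.0, 0.1, 0.9])),
]

# A toy "text query" embedding that points mostly along the first axis.
results = search(toy_index, [1.0, 0.2, 0.0], top_k=2)
for path, score in results:
    print(f"{path}: {score:.3f}")
```

Because every stored vector and the query are normalized to unit length, the dot product is the cosine similarity, so scores fall in the range -1 to 1 and can be compared across queries. A linear scan like this one is only reasonable for small datasets, which matches the scope stated in the abstract.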

Paper Structure

This paper contains 6 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: CLIPSE command line interface, example query and result.
  • Figure 2: CLIPSE web interface, example query and result.