Demonstration of MaskSearch: Efficiently Querying Image Masks for Machine Learning Workflows
Lindsey Linxi Wei, Chung Yik Edward Yeung, Hongjian Yu, Jingchuan Zhou, Dong He, Magdalena Balazinska
TL;DR
MaskSearch addresses the need to query image–mask collections by mask properties in machine-learning workflows. It introduces a Cumulative Histogram Index (CHI) and a filter-verification execution framework to efficiently evaluate $CP(mask, roi, (lv, uv))$ predicates, which count the pixels of a mask inside a region of interest $roi$ whose values fall in the range bounded by $lv$ and $uv$. These predicates support Filter, Top-$k$, and mask aggregation queries. The authors provide a GUI that hides SQL details and demonstrate scenarios for model debugging, adversarial detection, and human–model attention alignment, reporting substantial speedups over baseline approaches. The work offers a practical tool for rapid, mask-based retrieval and analysis in image-centric ML pipelines, with direct application to model debugging and data-centric model improvement.
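To make the $CP$ predicate and the query types concrete, here is a minimal brute-force sketch in plain Python. It is not MaskSearch's implementation: it scans every mask rather than using CHI, and it only illustrates what the queries compute. The function names (`cp`, `filter_query`, `top_k_query`), the end-exclusive ROI convention `(row0, col0, row1, col1)`, and the half-open value range $[lv, uv)$ are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# A mask is a 2D grid of pixel values, indexed as mask[row][col].
Mask = Sequence[Sequence[float]]
# Assumed ROI convention: (row0, col0, row1, col1), end-exclusive.
ROI = Tuple[int, int, int, int]


def cp(mask: Mask, roi: ROI, value_range: Tuple[float, float]) -> int:
    """Count pixels inside roi whose value v satisfies lv <= v < uv.

    The half-open range is an assumption of this sketch.
    """
    row0, col0, row1, col1 = roi
    lv, uv = value_range
    return sum(
        1
        for row in mask[row0:row1]
        for v in row[col0:col1]
        if lv <= v < uv
    )


def filter_query(masks: Sequence[Mask], roi: ROI,
                 value_range: Tuple[float, float], threshold: int) -> List[int]:
    """Return the ids of masks whose CP value exceeds threshold (full scan)."""
    return [i for i, m in enumerate(masks) if cp(m, roi, value_range) > threshold]


def top_k_query(masks: Sequence[Mask], roi: ROI,
                value_range: Tuple[float, float], k: int) -> List[int]:
    """Return the ids of the k masks with the largest CP value (ties by id)."""
    ranked = sorted(
        range(len(masks)),
        key=lambda i: (-cp(masks[i], roi, value_range), i),
    )
    return ranked[:k]


# Three tiny 2x2 masks used as toy data.
masks = [
    [[0.9, 0.8], [0.1, 0.2]],
    [[0.1, 0.9], [0.95, 0.85]],
    [[0.0, 0.0], [0.0, 0.0]],
]
whole_image = (0, 0, 2, 2)
high_values = (0.8, 1.0)

filtered = filter_query(masks, whole_image, high_values, threshold=1)
top2 = top_k_query(masks, whole_image, high_values, k=2)
```

Every query in this sketch touches every pixel of every mask. That full-scan cost is the baseline an index with a filter-verification step is meant to avoid.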
Abstract
We demonstrate MaskSearch, a system designed to accelerate queries over databases of image masks generated by machine learning models. MaskSearch formalizes and accelerates a new category of queries that retrieve images and their corresponding masks based on mask properties. These queries support various applications, from identifying spurious correlations learned by models to exploring discrepancies between model saliency and human attention. This demonstration makes the following contributions: (1) the introduction of MaskSearch's graphical user interface (GUI), which enables interactive exploration of image databases through mask properties, (2) hands-on opportunities for users to explore MaskSearch's capabilities and constraints within machine learning workflows, and (3) an opportunity for conference attendees to understand how MaskSearch accelerates queries over image masks.
