It's Not Just Labeling -- A Research on LLM Generated Feedback Interpretability and Image Labeling Sketch Features

Baichuan Li, Larry Powell, Tracy Hammond

TL;DR

This paper investigates how free-hand sketch-based labeling, guided by LLM feedback, can improve image annotation accessibility and explainability. It builds a synthetic dataset of sketch strokes via stochastic resampling, extracts sketch recognition features, and evaluates LLM feedback across four prompting strategies using a RAGAS-based metric framework. Results show generally weak links between sketch features and feedback interpretability, and that rubric-based prompts do not consistently enhance performance; few-shot basic prompting often yields better context precision and trustworthiness. The study highlights the potential of LLM-assisted sketch labeling while underscoring the need for real human data, temporal features, and user studies to validate scalability and practicality in diverse labeling scenarios.

Abstract

The quality of training data is critical to the performance of machine learning applications in domains like transportation, healthcare, and robotics. Accurate image labeling, however, often relies on time-consuming, expert-driven methods with limited feedback. This research introduces a sketch-based annotation approach supported by large language models (LLMs) to reduce technical barriers and enhance accessibility. Using a synthetic dataset, we examine how sketch recognition features relate to LLM feedback metrics, aiming to improve the reliability and interpretability of LLM-assisted labeling. We also explore how prompting strategies and sketch variations influence feedback quality. Our main contribution is a sketch-based virtual assistant that simplifies annotation for non-experts and advances LLM-driven labeling tools in terms of scalability, accessibility, and explainability.

Paper Structure

This paper contains 37 sections, 20 figures, 3 tables.

Figures (20)

  • Figure 1: This diagram shows the ideal flow of a sketch-based image labeling system. Users draw free-hand sketches in the sketch-based image labeling interface and receive feedback from an LLM. This research focuses mainly on the relationship between labeled images and LLMs, which is highlighted in the red box.
  • Figure 2: This diagram gives an overview of the methodology. We first generated the synthetic dataset of labeled images through stochastic resampling, and then generated feedback by giving four different prompts to the LLM. Next, sketch recognition (SR) features and LLM evaluation metrics were computed to quantify the sketches and the feedback. Finally, the results were derived by statistical analysis.
  • Figure 3: An example image used for synthetic dataset generation.
  • Figure 4: An example image mask used for synthetic dataset generation.
  • Figure 5: This diagram shows the steps used to generate the synthetic dataset of labeling strokes. First, contours of the objects of interest were extracted and stochastically resampled. Finally, SR features were computed.
  • ...and 15 more figures
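
The pipeline in the Figure 5 caption (contour, stochastic resampling, SR features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: this summary does not give the paper's resampling procedure or its feature set, so the function names, the uniform arc-length sampling, the Gaussian jitter, and the two geometric features are hypothetical stand-ins, not the authors' method.

```python
import math
import random


def path_length(points):
    """Total length of a polyline given as a list of (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def stochastic_resample(contour, n_points, jitter=1.0, rng=None):
    """Hypothetical resampler: draw n_points at random arc-length positions
    along the contour, then perturb each point with Gaussian noise to
    imitate an imprecise hand-drawn stroke."""
    rng = rng or random.Random()
    # Cumulative arc length at each contour vertex.
    cum = [0.0]
    for a, b in zip(contour, contour[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    # Random positions along the contour, kept in drawing order.
    targets = sorted(rng.uniform(0.0, cum[-1]) for _ in range(n_points))
    stroke, seg = [], 0
    for t in targets:
        # Advance to the segment that contains arc-length position t.
        while seg < len(cum) - 2 and cum[seg + 1] < t:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        frac = 0.0 if seg_len == 0 else (t - cum[seg]) / seg_len
        (x0, y0), (x1, y1) = contour[seg], contour[seg + 1]
        # Linear interpolation on the segment plus assumed Gaussian jitter.
        x = x0 + frac * (x1 - x0) + rng.gauss(0.0, jitter)
        y = y0 + frac * (y1 - y0) + rng.gauss(0.0, jitter)
        stroke.append((x, y))
    return stroke


def sketch_features(stroke):
    """Two simple geometric features, chosen here only as examples."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    diagonal = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    return {"stroke_length": path_length(stroke), "bbox_diagonal": diagonal}


# A closed square stands in for an object contour extracted from a mask.
square = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
stroke = stochastic_resample(square, n_points=32, jitter=0.0, rng=random.Random(0))
features = sketch_features(stroke)
```

With `jitter=0.0` the sampled points stay exactly on the contour, which makes the sketch easy to check; a positive `jitter` produces the kind of stroke variation a synthetic dataset would need. Passing a seeded `random.Random` keeps each generated stroke reproducible.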