OSPC: Artificial VLM Features for Hateful Meme Detection

Peter Grönquist

TL;DR

The paper addresses hateful meme detection in Singapore's multilingual online landscape under resource constraints. It combines OCR-based text extraction, cross-lingual understanding, and a quantized Vision-Language Model with task-adapted prompts to produce an artificial last-layer score for prediction. It reports AUROC up to 0.76 on the OSPC test set, demonstrating that domain-specific quantization and prompt-based encodings can yield competitive performance with reduced computational demands. The work highlights practical strategies for deploying multilingual, vision-language safety systems on limited hardware and data, with potential applicability to private or restricted-weight models.

Abstract

The digital revolution and the advent of the World Wide Web have transformed human communication, notably through the emergence of memes. While memes are a popular and straightforward form of expression, they can also be used to spread misinformation and hate due to their anonymity and ease of use. In response to these challenges, this paper introduces a solution developed by team 'Baseline' for the AI Singapore Online Safety Prize Challenge. Focusing on computational efficiency and feature engineering, the solution achieved an AUROC of 0.76 and an accuracy of 0.69 on the test dataset. As key features, the solution leverages the inherent probabilistic capabilities of large Vision-Language Models (VLMs) to generate task-adapted feature encodings from text, and applies a distilled quantization tailored to the specific cultural nuances present in Singapore. This type of processing and fine-tuning can be adapted to various visual and textual understanding and classification tasks, and can even be applied to private VLMs such as OpenAI's GPT. Finally, it can eliminate the need for extensive model training on large GPUs in resource-constrained applications, and it also offers a solution when little or no data is available.

Paper Structure (7 sections, 1 figure)

Figures (1)

  • Figure 1: Architecture of the harmfulness classifier. Given an image, resized to fit the model and space constraints, the pipeline first constructs a prompt for the VLM using OCR and translation models. Given the prompt and the image, the quantized VLM, llava v1.6 32B, then predicts exactly one token following '0.'. The probabilities of the 10 tokens allowed by the grammar, the digits [0-9], are each multiplied by their respective digit value, aggregated, and normalized to produce the final probability.
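
The aggregation step described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `digit_score` and the input format (a mapping from digit token to log-probability) are hypothetical, and the caption does not state the normalization constant, so dividing the expected digit by 9 to map the score onto [0, 1] is an assumption.

```python
import math

# The ten tokens the grammar allows after the "0." prefix.
DIGITS = [str(d) for d in range(10)]


def digit_score(digit_logprobs: dict) -> float:
    """Turn per-digit log-probabilities into a single score in [0, 1].

    `digit_logprobs` maps a digit token ("0".."9") to the log-probability
    the VLM assigned to it as the next token. Digits missing from the
    mapping are treated as having zero probability (an assumption).
    """
    # Convert log-probabilities to probabilities.
    probs = [
        math.exp(digit_logprobs[d]) if d in digit_logprobs else 0.0
        for d in DIGITS
    ]

    # Renormalize so the ten digit probabilities sum to one.
    total = sum(probs)
    if total <= 0.0:
        raise ValueError("no probability mass on any digit token")
    probs = [p / total for p in probs]

    # Multiply each probability by its digit value and aggregate.
    expected_digit = sum(value * p for value, p in enumerate(probs))

    # Assumed normalization: the largest possible digit is 9.
    return expected_digit / 9.0
```

Under these assumptions, a model that puts all its mass on "9" yields a score of 1, one that puts all its mass on "0" yields 0, and a uniform distribution over the ten digits yields 0.5. Because the score is a weighted average rather than the single most likely digit, it keeps the model's uncertainty and can be ranked directly for a metric such as AUROC.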