FLANS at SemEval-2026 Task 7: RAG with Open-Sourced Smaller LLMs for Everyday Knowledge Across Diverse Languages and Cultures

Liliia Bogdanova; Shiran Sun; Lifeng Han; Natalia Amat Lefort; Flor Miriam Plaza-del-Arco

TL;DR

This system paper describes the authors' participation in SemEval-2026 Task 7, "Everyday Knowledge Across Diverse Languages and Cultures", and their creation of culturally aware knowledge bases (CulKBs).

Abstract

This system paper describes our participation in SemEval-2026 Task 7, "Everyday Knowledge Across Diverse Languages and Cultures". We participated in two subtasks: Track 1, Short Answer Questions (SAQ), and Track 2, Multiple-Choice Questions (MCQ). Our method is retrieval-augmented generation (RAG) with open-sourced smaller LLMs (OS-sLLMs). To better adapt to this shared task, we created our own culturally aware knowledge bases (CulKBs) by extracting Wikipedia content using keyword lists we prepared. We extracted both culturally aware wiki-text and country-specific wiki-summaries. In addition to the local CulKBs, one of our systems integrates live online search output via DuckDuckGo. For better privacy and sustainability, we aimed to deploy smaller LLMs (sLLMs) that are open-sourced on the Ollama platform. We share the prompts we developed using refinement techniques and report the learning curve of such prompts. The tested languages are English, Spanish, and Chinese for both tracks. Our resources and code are shared via https://github.com/aaronlifenghan/FLANS-2026
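To make the retrieve-then-prompt idea concrete, here is a minimal sketch of keyword-overlap retrieval over a tiny local knowledge base, followed by prompt assembly. It is not the authors' implementation: `TOY_KB`, `tokenize`, `retrieve`, and `build_prompt` are hypothetical names, the passages are illustrative stand-ins for Wikipedia-derived entries, and the call to an LLM served through Ollama is deliberately left out.

```python
import re

# Hypothetical toy knowledge base: (country, passage) pairs standing in for
# Wikipedia-derived entries. The contents are illustrative only.
TOY_KB = [
    ("Spain", "In Spain, lunch is commonly eaten between 2 pm and 3 pm."),
    ("China", "In China, the Spring Festival is a major family holiday."),
    ("UK", "In the UK, tea is a widely consumed hot drink."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def retrieve(question, kb, top_k=1):
    """Rank passages by the number of distinct tokens shared with the question."""
    q_tokens = set(tokenize(question))
    scored = []
    for country, passage in kb:
        p_tokens = set(tokenize(country + " " + passage))
        scored.append((len(q_tokens & p_tokens), country, passage))
    # Highest overlap first; ties keep their original KB order (stable sort).
    scored.sort(key=lambda item: item[0], reverse=True)
    # Drop passages that share no token with the question.
    return [(c, p) for score, c, p in scored[:top_k] if score > 0]


def build_prompt(question, contexts):
    """Assemble a plain-text prompt with retrieved passages before the question."""
    if contexts:
        context_block = "\n".join(f"- [{c}] {p}" for c, p in contexts)
    else:
        context_block = "(no relevant passage found)"
    return (
        "Use the context to answer the question briefly.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    question = "What time do people in Spain usually eat lunch?"
    print(build_prompt(question, retrieve(question, TOY_KB)))
```

The resulting prompt string would then be sent to whichever model is in use. Plain token overlap is chosen here only for brevity; the scoring function could be swapped for any other retriever without changing the prompt-building step.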

Paper Structure (23 sections, 4 figures, 2 tables)


Introduction
Related Work
Methodology
Language and Model Selection
Prompt Development
Knowledge Base Constructions
Creating Pseudo Ground Truth
Language Routing
System Development and Validation
Experimental and Evaluation Setup
Ablation Study of Prompts
Submission to SemEval-Test
Conclusions and Future Work
Disclaimer
RAG-base and RAG-web
...and 8 more sections

Figures (4)

Figure 1: RAG pipeline. Italics indicate variations.
Figure 2: RAG-base learning curves for three prompt ablations, averaged over three languages (en, es, zh).
Figure 3: RAG-base pipeline using smaller LLMs (favoring Gemma3.4b): keyword-based KE, then landing in the KB.
Figure 4: RAG-web pipeline using mistral:7b and the larger deepseek-llm:67b: conditional adaptive RAG.
