End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach

H. M. Shadman Tabib; Jaber Ahmed Deedar

TL;DR

This paper addresses Bangla-language math Olympiad problem solving with LLMs. It presents an end-to-end pipeline that combines model selection, dataset augmentation, retrieval-augmented generation (RAG), and tool-integrated reasoning (TIR) with self-consistency to tackle multi-step math tasks in Bangla. It demonstrates that problem categorization, targeted prompting, and iterative reasoning yield measurable gains, with configurations like Qwen-2.5-32B-Instruct-AWQ reaching $77/100$ on a test set and GPT-4o-based TIR setups excelling on the BDMO benchmark ($125/209$). The work highlights the practical potential of combining large, multilingual LLMs with augmented data and retrieval strategies, and outlines directions for improving retrieval quality and domain-specific datasets.

Abstract

This work introduces a systematic approach for enhancing large language models (LLMs) to address Bangla AI mathematical challenges. By assessing diverse LLM configurations, fine-tuning on specific datasets, and implementing Retrieval-Augmented Generation (RAG), we enhanced the model's reasoning precision in a multilingual setting. Our key findings indicate that customized prompting, dataset augmentation, and iterative reasoning improve the model's efficiency on Olympiad-level mathematical challenges.
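The self-consistency step named in the TL;DR can be pictured as majority voting over several independently sampled answers. The following is a minimal sketch under stated assumptions, not the authors' implementation: `solve_once` is a hypothetical callable standing in for one tool-integrated reasoning run that returns an integer answer (or `None` when the run fails), and `n_samples` is an illustrative parameter.

```python
from collections import Counter
from typing import Callable, Optional


def self_consistent_answer(
    solve_once: Callable[[str], Optional[int]],  # hypothetical: one reasoning run
    problem: str,
    n_samples: int = 8,  # illustrative default, not taken from the paper
) -> Optional[int]:
    """Run the solver several times and return the most frequent answer."""
    answers = []
    for _ in range(n_samples):
        answer = solve_once(problem)
        # Discard runs that failed to produce a parseable answer.
        if answer is not None:
            answers.append(answer)
    if not answers:
        return None
    # Majority vote; ties go to the answer that appeared first.
    return Counter(answers).most_common(1)[0][0]
```

In this sketch, failed runs simply do not vote, so a single malformed generation cannot override an otherwise consistent answer.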

Paper Structure (18 sections, 1 figure, 7 tables)

This paper contains 18 sections, 1 figure, 7 tables.

Introduction
Methodology
Model Selection
Datasets
Preprocessing
Augmentation
Fine-tuning
Model Architecture and Flow
Model Inference and Prompting
Results
Experiments and Discussion
Problem Categorization for Improved Performance
Tailored Prompting for Problem Types
Multilingual Reasoning and Repetitive Querying
Prompt Phrasing and Politeness
...and 3 more sections

Figures (1)

Figure 1: Model Architecture
