End-to-End Bangla AI for Solving Math Olympiad Problem Benchmark: Leveraging Large Language Model Using Integrated Approach
H. M. Shadman Tabib, Jaber Ahmed Deedar
TL;DR
This paper addresses Bangla-language math Olympiad problem solving with LLMs. It presents an end-to-end pipeline that combines model selection, dataset augmentation, retrieval-augmented generation (RAG), and tool-integrated reasoning (TIR) with self-consistency to tackle multi-step math tasks in Bangla. It demonstrates that problem categorization, targeted prompting, and iterative reasoning yield measurable gains, with configurations like Qwen-2.5-32B-Instruct-AWQ reaching $77/100$ on a test set and GPT-4o-based TIR setups excelling on the BDMO benchmark ($125/209$). The work highlights the practical potential of combining large, multilingual LLMs with augmented data and retrieval strategies, and outlines directions for improving retrieval quality and domain-specific datasets.
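The self-consistency step mentioned above can be illustrated with a minimal sketch. It assumes majority voting over integer final answers, which this summary does not spell out; `sample_fn` is a hypothetical stand-in for one tool-integrated reasoning run that returns an integer answer, or `None` when the run fails.

```python
from collections import Counter
from typing import Callable, Optional


def self_consistent_answer(
    sample_fn: Callable[[str], Optional[int]],  # hypothetical: one TIR run per call
    problem: str,
    n_samples: int = 8,
) -> Optional[int]:
    """Sample several candidate answers and return the most frequent one."""
    answers = []
    for _ in range(n_samples):
        answer = sample_fn(problem)
        if answer is not None:  # drop runs that produced no usable answer
            answers.append(answer)
    if not answers:
        return None
    # most_common(1) yields [(answer, count)]; ties go to the first-seen answer
    return Counter(answers).most_common(1)[0][0]
```

In practice `sample_fn` would wrap a call to the chosen model with a non-zero sampling temperature, so that repeated runs can disagree and the vote has something to resolve.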
Abstract
This work introduces a systematic approach for enhancing large language models (LLMs) to address Bangla AI mathematical challenges. By assessing diverse LLM configurations, fine-tuning on specific datasets, and implementing Retrieval-Augmented Generation (RAG), we improved the model's reasoning precision in a multilingual setting. Key findings indicate that customized prompting, dataset augmentation, and iterative reasoning improve the model's effectiveness on Olympiad-level mathematical problems.
