Enhancing Burmese News Classification with Kolmogorov-Arnold Network Head Fine-tuning
Thura Aung, Eaint Kay Khaing Kyaw, Ye Kyaw Thu, Thazin Myint Oo, Thepchai Supnithi
TL;DR
The paper addresses Burmese news sentence classification in low-resource settings by freezing the encoder and fine-tuning only the classification head. It compares three Kolmogorov-Arnold Network (KAN) heads (FourierKAN, EfficientKAN, and FasterKAN) against a standard MLP across static and contextual embeddings, showing that KAN heads, particularly EfficientKAN with fastText, achieve competitive or superior F1 scores. The study analyzes the efficiency-accuracy trade-off in detail, identifying FasterKAN as a strong speed-accuracy compromise and finding that embedding choice dominates performance. The results suggest that KAN heads are a promising, lightweight alternative for low-resource NLP tasks and motivate broader adoption and extension to other languages and tasks.
Abstract
In low-resource languages like Burmese, classification tasks often fine-tune only the final classification layer, keeping pre-trained encoder weights frozen. While Multi-Layer Perceptrons (MLPs) are commonly used, their fixed non-linearity can limit expressiveness and increase computational cost. This work explores Kolmogorov-Arnold Networks (KANs) as alternative classification heads, evaluating Fourier-based FourierKAN, spline-based EfficientKAN, and grid-based FasterKAN across diverse embeddings, including TF-IDF, fastText, and multilingual transformers (mBERT, Distil-mBERT). Experimental results show that KAN-based heads are competitive with or superior to MLPs. EfficientKAN with fastText achieved the highest F1-score (0.928), while FasterKAN offered the best trade-off between speed and accuracy. On transformer embeddings, EfficientKAN matched or slightly outperformed MLPs with mBERT (0.917 F1). These findings highlight KANs as expressive, efficient alternatives to MLPs for low-resource language classification.
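To make the head-only setup concrete, the following is a minimal sketch of a Fourier-series KAN layer applied to a fixed embedding vector. It is not the authors' implementation. The class name `FourierKANHead`, the truncated Fourier form of each edge function, the initialization scale, the toy embedding, and the choice of three output classes are all assumptions made for illustration. The point it shows is structural: where an MLP applies one fixed activation after a linear map, a KAN head gives every input-output edge its own learnable univariate function.

```python
import math
import random


class FourierKANHead:
    """Toy KAN-style layer (illustrative assumption, not the paper's code).

    Each (output j, input i) edge carries a learnable univariate function,
    here a truncated Fourier series with num_freqs cosine and sine terms.
    """

    def __init__(self, in_dim, out_dim, num_freqs=3, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(in_dim * num_freqs)  # assumed init scale
        self.num_freqs = num_freqs
        self.out_dim = out_dim
        # cos_w[j][i][k], sin_w[j][i][k]: coefficients for frequency k + 1
        self.cos_w = [[[rng.gauss(0.0, scale) for _ in range(num_freqs)]
                       for _ in range(in_dim)] for _ in range(out_dim)]
        self.sin_w = [[[rng.gauss(0.0, scale) for _ in range(num_freqs)]
                       for _ in range(in_dim)] for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def forward(self, x):
        """Return one logit per class for a single embedding vector x."""
        logits = []
        for j in range(self.out_dim):
            total = self.bias[j]
            for i, xi in enumerate(x):
                for k in range(self.num_freqs):
                    freq = k + 1
                    total += self.cos_w[j][i][k] * math.cos(freq * xi)
                    total += self.sin_w[j][i][k] * math.sin(freq * xi)
            logits.append(total)
        return logits


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


# Stand-in for the output of a frozen encoder; the values are made up.
embedding = [0.1, -0.4, 0.7, 0.2]
head = FourierKANHead(in_dim=4, out_dim=3)  # 3 classes is an arbitrary choice
probs = softmax(head.forward(embedding))
```

In a real pipeline the embedding would come from a frozen model such as fastText or mBERT, and only the head's coefficients and bias would receive gradient updates. An automatic-differentiation framework would replace the explicit loops.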
