Multi-level Product Category Prediction through Text Classification
Wesley Ferreira Maia, Angelo Carmignani, Gabriel Bortoli, Lucas Maretti, David Luz, Daniel Camilo Fuentes Guzman, Marcos Jardel Henriques, Francisco Louzada Neto
TL;DR
This work tackles multi-level product category prediction in the Brazilian retail domain by comparing LSTM and BERT models on a Portuguese dataset. It combines data augmentation via web scraping with focal loss to address class imbalance, achieving strong performance across segments, categories, subcategories, and products, particularly with BERT on the more detailed categories. The study demonstrates that carefully chosen embeddings and augmentation strategies significantly enhance NLP performance in retail, offering practical guidance for deployment. Overall, the findings validate the value of transformer-based models with targeted preprocessing for hierarchical product classification in real-world retail settings.
Abstract
This article investigates the application of advanced machine learning models, specifically LSTM and BERT, to text classification for predicting multiple categories in the retail sector. The study demonstrates how data augmentation techniques and the focal loss function can significantly improve accuracy in classifying products into multiple categories on a robust Brazilian retail dataset. The LSTM model, enriched with Brazilian word embeddings, and BERT, known for its effectiveness in understanding complex contexts, were adapted and optimized for this specific task. The results show that the BERT model, with an F1 Macro Score of up to $99\%$ for segments, $96\%$ for categories and subcategories, and $93\%$ for product names, outperformed LSTM in the more detailed categories. However, LSTM also achieved high performance, especially after data augmentation and focal loss were applied. These results underscore the effectiveness of NLP techniques in retail and highlight the importance of carefully selecting modelling and preprocessing strategies. This work contributes to the field of NLP in retail, providing insights for future research and practical applications.
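For readers unfamiliar with focal loss, the following is a minimal sketch of its standard multi-class form, $FL(p_t) = -\alpha (1 - p_t)^{\gamma} \log(p_t)$, where $p_t$ is the predicted probability of the true class. The function names, the example logits, and the values `gamma=2.0` and `alpha=1.0` are illustrative assumptions; the abstract does not state the hyperparameters or implementation used in the study.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed illustrative defaults, not values from the paper.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, recovered from focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# Hypothetical three-class example where the true class is index 0.
easy = [5.0, 0.0, 0.0]  # model is confident and correct
hard = [0.0, 5.0, 0.0]  # model is confident and wrong

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} ratio={fl / ce:.4f}")
```

The modulating factor $(1 - p_t)^{\gamma}$ shrinks the loss of well-classified examples far more than that of misclassified ones, so training gradients are dominated less by the abundant easy examples, which is why the loss is commonly used under class imbalance.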
