OpenCML: End-to-End Framework of Open-world Machine Learning to Learn Unknown Classes Incrementally
Jitendra Parmar, Praveen Singh Thakur
TL;DR
OpenCML addresses the challenge of open-world, incremental text classification by jointly discovering unknown classes and learning them without forgetting. It combines an open-text classifier, BIRCH-based novel-class discovery, SingleRank-based labeling, and a cross-distillation loss built on exemplar memory to enable continual learning. The approach is validated on four NLP datasets, showing strong incremental accuracy and robust open-text classification, with memory-aware improvements and clustering/labeling ablations supporting design choices. The work highlights practical significance for dynamic NLP systems, offering scalable memory management and a pathway toward integrating knowledge transfer and reinforcement learning in open-world continual learning.
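To make the cross-distillation idea concrete, here is a minimal sketch under stated assumptions rather than the authors' exact formulation. It assumes the loss is a weighted sum of a distillation term (soft targets from the previous model on the old classes) and a standard cross-entropy term on the current label. The function names, the weight `alpha`, and the `temperature` value are hypothetical and do not come from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Cross-entropy of the model output against a hard label."""
    return -math.log(softmax(logits)[target_index])


def distillation(old_logits, new_logits, temperature=2.0):
    """Soft-target cross-entropy on the old classes only.

    old_logits: outputs of the previous model (old classes).
    new_logits: outputs of the updated model (old classes first, then new ones).
    """
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[: len(old_logits)], temperature)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))


def cross_distillation_loss(old_logits, new_logits, target_index,
                            alpha=0.5, temperature=2.0):
    """Assumed form: alpha * distillation + (1 - alpha) * cross-entropy."""
    return (alpha * distillation(old_logits, new_logits, temperature)
            + (1 - alpha) * cross_entropy(new_logits, target_index))


# Hypothetical example: two old classes, one newly added class (index 2).
old = [2.0, 0.5]
new = [1.8, 0.6, 3.0]
loss = cross_distillation_loss(old, new, target_index=2)
```

In a sketch like this, the distillation term would be evaluated on exemplars drawn from memory as well as on new samples, so that the updated model stays close to the previous one on old classes while fitting the new class.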
Abstract
Open-world machine learning is an emerging area of artificial intelligence. Conventional machine learning models typically follow the closed-world assumption, which can hinder their ability to retain previously learned knowledge for future tasks. Automated intelligent systems, however, must handle both novel classes and previously learned tasks. The proposed model learns novel classes in an open, continual learning environment. It consists of two distinct but connected tasks. First, it discovers unknown classes in the data and creates novel classes; next, it learns each new class incrementally. Together, these tasks enable continual learning, allowing the system to expand its understanding of the data and improve over time. The proposed model also outperformed existing approaches in open-world learning. Furthermore, it demonstrated strong performance in continual learning, achieving a highest average accuracy of 82.54% over four iterations and a minimum accuracy of 65.87%.
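The first of the two tasks, separating samples of known classes from candidates for unknown classes, can be illustrated with a minimal sketch. It assumes a simple rejection rule: a sample is treated as unknown when the classifier's highest softmax probability falls below a threshold. This rule, the function name `split_known_unknown`, and the threshold value are illustrative assumptions; the paper's open-text classifier may use a different rejection mechanism.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_known_unknown(batch_logits, threshold=0.7):
    """Split a batch into confidently known samples and unknown candidates.

    Returns (known, unknown), where known holds (sample_index, class_index)
    pairs and unknown holds the indices of rejected samples. The threshold
    is a hypothetical value chosen for illustration.
    """
    known, unknown = [], []
    for index, logits in enumerate(batch_logits):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            known.append((index, best))
        else:
            unknown.append(index)
    return known, unknown


# Hypothetical logits: the first sample is confident, the second is not.
batch = [[5.0, 0.0, 0.0], [0.1, 0.0, 0.1]]
known, unknown = split_known_unknown(batch)
```

The rejected samples would then be passed to the discovery step, where they are grouped into candidate novel classes before the incremental learning step adds those classes to the model.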
