Open-world machine learning: A review and new outlooks

Fei Zhu, Shijie Ma, Zhen Cheng, Xu-Yao Zhang, Zhaoxiang Zhang, Dacheng Tao, Cheng-Lin Liu

TL;DR

This paper surveys open-world learning (OWL) and places three interdependent tasks (unknown rejection, novel class discovery, and continual learning) in a unified framework. It offers a detailed taxonomy; reviews hundreds of methods across out-of-distribution (OOD) detection and open-set recognition (OSR), novel class discovery (NCD) and generalized category discovery (GCD), and continual learning (CL); and discusses theoretical and empirical advances, benchmarks, and evaluation metrics. The authors highlight core challenges, summarize datasets and performance trends, and propose future directions including unified open-world systems, large-model adaptation, structured-data OWL, and brain-inspired unlearning. By synthesizing current progress and gaps, the paper aims to accelerate the development of resilient, autonomous AI systems that learn continually in dynamic real-world environments.

Abstract

Machine learning has achieved remarkable success in many applications. However, existing studies are largely based on the closed-world assumption, which assumes that the environment is stationary, and the model is fixed once deployed. In many real-world applications, this fundamental and rather naive assumption may not hold because an open environment is complex, dynamic, and full of unknowns. In such cases, rejecting unknowns, discovering novelties, and then continually learning them, could enable models to be safe and evolve continually as biological systems do. This article presents a holistic view of open-world machine learning by investigating unknown rejection, novelty discovery, and continual learning in a unified paradigm. The challenges, principles, and limitations of current methodologies are discussed in detail. Furthermore, widely used benchmarks, metrics, and performances are summarized. Finally, we discuss several potential directions for further progress in the field. By providing a comprehensive introduction to the emerging open-world machine learning paradigm, this article aims to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.

Paper Structure

This paper contains 30 sections, 25 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Illustration of open-world machine learning scenarios. In intelligent driving, robot manipulation, medical diagnosis and chatbot scenarios, the environments are open, dynamic and complex. Open-world machine learning enables the model to learn and evolve safely from streaming data.
  • Figure 2: Illustration of the life cycle of a learning system in open-world applications. (A) Humans continually learn new knowledge throughout their lives while maintaining and using previous knowledge, becoming smarter and more skillful over time. (B) Open-world machine learning aims to build a human-like system that can transfer and consolidate knowledge continually during deployment. (C) An open-world learning paradigm mainly includes three parts, i.e., unknown rejection, novel class discovery and continual learning.
  • Figure 3: Example applications of open-world machine learning systems. (A) Autonomous driving can leverage open-world learning to handle unknown objects and evolving environments, enabling safe navigation and adaptation to dynamic conditions. (B) Medical diagnosis can encounter diverse input distributions from clients, and open-world learning enables more reliable clinical decisions and responsiveness. (C) An AI chatbot continually learns from user-specific information and improves its dialogue behavior over time.
  • Figure 4: The evolutionary tree of unknown rejection methods.
  • Figure 5: Illustration of the types of methods for OOD detection (first row) and open-set recognition (OSR, second row). In OOD detection, methods are divided into three types, i.e., (a), (b) and (c), according to their training and inference strategies. In OSR, methods are likewise divided into three types, i.e., (d), (e) and (f), according to their modeling perspective. In OSR, "loss D", "loss G" and "loss H" denote loss functions for discriminative, generative and hybrid models, respectively. Here, $f$ and $g$ denote the feature extractor and generator, respectively.
  • ...and 9 more figures