Machine Learning and Data Analysis Using Posets: A Survey
Arnauld Mesinga Mwafise
TL;DR
The survey addresses how partially ordered sets (posets) and lattice theory underpin data analysis and machine learning, with formal concept analysis (FCA) as a central thread. It groups approaches into ML on posets, FCA-based learning, clustering, and exploratory/descriptive data analysis across diverse domains, emphasizing interpretability and the handling of incomparable elements. A key contribution is a taxonomy of poset-based methods, including depth-based model comparison and applications in NLP, time series, ranking, and environmental and socio-economic analysis, along with practical datasets and software. The paper also outlines four forward-looking directions, including the integration of topology, GNNs on lattices, and FCA-driven ontology learning, and presents posets as a flexible framework for non-Euclidean data and complex structured analyses. Meet and join operations are denoted by $\wedge$ and $\vee$, respectively.
Abstract
Posets are discrete mathematical structures that are ubiquitous across data analysis and machine learning applications. Research connecting posets to data science has been ongoing for many years. This paper presents a comprehensive review of studies on data analysis and machine learning using posets, examining their theory, algorithms, and applications. In addition, formal concept analysis, an applied branch of lattice theory, is highlighted in terms of its machine learning applications.
