On topological and algebraic structures of categorical random variables
Inocencio Ortiz, Santiago Gómez-Guerrero, Christian E. Schaerer
TL;DR
The paper develops a topological-algebraic framework for categorical random variables, using symmetrical uncertainty (SU) to define a normalized distance on a quotient by indiscernibility. It proves that a natural joint operation makes this quotient space a commutative monoid and that the operation is continuous with respect to the SU-based topology. Key contributions include a non-discrete metric topology, a well-defined SU on equivalence classes, and a contractive compatibility between the algebraic and topological structures. These results enable principled, non-parametric similarity assessments and compositional operations on qualitative variables, with proposed extensions to multivariate SU (MSU).
Abstract
Based on entropy and symmetrical uncertainty (SU), we define a metric for categorical random variables and show that this metric descends to an appropriate quotient space of categorical random variables. We also show that this quotient space carries a natural commutative monoid structure that is compatible with the topology induced by the metric, in the sense that the monoid operation is continuous.
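To make the objects in the abstract concrete, the following minimal sketch estimates SU from paired samples using the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The distance form `1 - SU`, the convention for two constant variables, and the function names (`entropy`, `joint`, `symmetric_uncertainty`, `su_distance`) are assumptions made for illustration; the abstract does not spell out the exact distance, and the paper's own construction may differ.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable labels."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def joint(x, y):
    """Joint variable: pair the observations of x and y position-wise."""
    return list(zip(x, y))


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), estimated from paired samples."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Assumed convention: two constant variables are treated as indiscernible.
        return 1.0
    mutual_information = hx + hy - entropy(joint(x, y))
    return 2 * mutual_information / (hx + hy)


def su_distance(x, y):
    """Assumed normalized distance 1 - SU; 0 means indiscernible under SU."""
    return 1.0 - symmetric_uncertainty(x, y)


x = ["a", "b", "a", "b"]
relabeled = ["u", "v", "u", "v"]    # same partition of the samples as x
independent = ["p", "p", "q", "q"]  # empirically independent of x
constant = ["c", "c", "c", "c"]     # carries no information

d_relabel = su_distance(x, relabeled)
d_indep = su_distance(x, independent)
d_identity = su_distance(joint(x, constant), x)
```

Under these assumptions, `x` and `relabeled` are at distance 0, which is the kind of identification the quotient by indiscernibility formalizes, while `x` and `independent` are at the maximal distance 1. Joining `x` with a constant variable gives a variable at distance 0 from `x`, consistent with a constant variable acting as the identity of the joint operation.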
