Hierarchical mixture of discriminative Generalized Dirichlet classifiers

Elvis Togban, Djemel Ziou

TL;DR

This work addresses discriminative classification for compositional data by modeling class-conditional distributions with the Generalized Dirichlet and using its posterior as a classifier (DGD). It extends to a hierarchical mixture of DGD experts (HMGD) in a tree-structured mixture-of-experts framework, enabling region-specific discrimination on the simplex. A novel variational upper-bound for the GD mixture enables tractable maximum-likelihood learning, addressing intractability in prior approaches. Empirical results on UCI datasets, spam detection, and color space identification show competitive accuracy and robustness, with HMGD often outperforming DGD and offering a principled simplex-based alternative to alpha-transformation preprocessing. Overall, the paper advances discriminative, simplex-constrained classification with a scalable hierarchical framework and tractable inference.

Abstract

This paper presents a discriminative classifier for compositional data. The classifier is based on the posterior distribution of the Generalized Dirichlet, which is the discriminative counterpart of the Generalized Dirichlet mixture model. Moreover, following the mixture-of-experts paradigm, we propose a hierarchical mixture of this classifier. To learn the model parameters, we use a variational approximation, deriving an upper bound for the Generalized Dirichlet mixture. To the best of our knowledge, this is the first time this bound has been proposed in the literature. Experimental results are presented for spam detection and color space identification.
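
As a rough illustration of the idea of classifying with a Generalized Dirichlet posterior, the sketch below evaluates the standard (Connor-Mosimann) Generalized Dirichlet log-density and turns per-class densities into class posteriors with Bayes' rule. It is a minimal sketch under stated assumptions, not the paper's training procedure: the function names `gd_log_pdf` and `class_posterior`, the class priors, and all parameter values are hypothetical, and the parameters are taken as given rather than learned with the variational bound.

```python
import math


def gd_log_pdf(x, a, b):
    """Log-density of the Generalized Dirichlet (Connor-Mosimann form).

    x holds the first D parts of a (D+1)-part composition; the last
    part is implied as 1 - sum(x). a and b are length-D positive lists.
    """
    d = len(x)
    log_p = 0.0
    cum = 0.0  # running sum x_1 + ... + x_i
    for i in range(d):
        cum += x[i]
        # Exponent of (1 - x_1 - ... - x_i); the last term uses b_D - 1.
        if i < d - 1:
            gamma_i = b[i] - a[i + 1] - b[i + 1]
        else:
            gamma_i = b[i] - 1.0
        log_p += math.lgamma(a[i] + b[i]) - math.lgamma(a[i]) - math.lgamma(b[i])
        log_p += (a[i] - 1.0) * math.log(x[i]) + gamma_i * math.log(1.0 - cum)
    return log_p


def class_posterior(x, priors, params):
    """Posterior p(class | x) from class priors and per-class (a, b).

    Assumption: the posterior is formed by Bayes' rule over Generalized
    Dirichlet class-conditionals; values are normalized in log space.
    """
    log_joint = [
        math.log(pi) + gd_log_pdf(x, a, b) for pi, (a, b) in zip(priors, params)
    ]
    m = max(log_joint)
    weights = [math.exp(v - m) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-class example on a 3-part composition (0.2, 0.3, 0.5).
x = [0.2, 0.3]
priors = [0.5, 0.5]
params = [([2.0, 3.0], [4.0, 2.0]), ([1.0, 1.0], [2.0, 1.0])]
print(class_posterior(x, priors, params))
```

With a single free component (D = 1) the density reduces to a Beta density, and when each b_i equals a_{i+1} + b_{i+1} it reduces to an ordinary Dirichlet, which gives two simple sanity checks for the implementation.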

Paper Structure (15 sections, 40 equations, 5 figures, 5 tables, 2 algorithms)

Figures (5)

  • Figure 1: Compositional data illustration.
  • Figure 2: Illustration of the HMGD splitting process and the final decision boundary. The solid curve is the boundary between regions (resp. sub-regions or classes). In panel (a), each split is performed by a DGD model.
  • Figure 3: Log-likelihood of a two-component Beta mixture and its upper bound, generated at the point marked by the red circle.
  • Figure 4: Matthews correlation coefficient according to the results of Table \ref{tab:Performance-vs-old}.
  • Figure 5: Matthews correlation coefficient according to the results of Table \ref{tab:Performance-vs-spam}.