Centrality Estimators for Probability Density Functions

Djemel Ziou

TL;DR

The paper introduces centrality-based estimators (C-estimators) for probability density fitting by defining Hölder and Lehmer centralities, linking maximum centrality to a generalized likelihood that relaxes the IID assumption. It analyzes mathematical properties, data-selection semantics, and first-order conditions for estimation, highlighting that standard MLE is a special case when $\alpha\to 0$. Through theoretical insights and a case study on exponential PDFs and DCT coefficient histograms, it demonstrates that optimal fits often occur at nonzero $\alpha$, illustrating robustness and adaptability beyond classical likelihood. The work proposes practical accuracy measures, including residual-based errors and observed C-Fisher information, to gauge estimator performance and uncertainty, with implications for machine learning, data mining, and signal processing tasks that rely on accurate PDF fitting.
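The claim that MLE is recovered as $\alpha\to 0$ can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the Hölder centrality is the power mean of order $\alpha$ of the density values $f(x_i;\theta)$. The paper's exact definition may differ. Under that assumption the $\alpha\to 0$ limit is the geometric mean, that is, the likelihood raised to $1/n$, so its maximizer coincides with the MLE. The function names, the synthetic data, and the grid search below are hypothetical choices, not the author's procedure.

```python
import math
import random


def log_exp_pdf(x, theta):
    """Log of the exponential density f(x; theta) = theta * exp(-theta * x)."""
    return math.log(theta) - theta * x


def holder_centrality(theta, data, alpha):
    """Power (Hölder) mean of order alpha of the density values f(x_i; theta).

    ASSUMPTION: this form of the centrality is illustrative and is not
    taken from the paper.
    """
    logs = [log_exp_pdf(x, theta) for x in data]
    if alpha == 0.0:
        # Limit alpha -> 0: geometric mean, i.e. likelihood ** (1 / n).
        return math.exp(sum(logs) / len(logs))
    mean_pow = sum(math.exp(alpha * v) for v in logs) / len(logs)
    return mean_pow ** (1.0 / alpha)


def h_estimate(data, alpha, lo=0.005, hi=10.0, step=0.005):
    """Grid-search maximizer of the centrality over theta in [lo, hi]."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return max(grid, key=lambda t: holder_centrality(t, data, alpha))


# Synthetic exponential sample with rate 2 (hypothetical data).
rng = random.Random(0)
data = [rng.expovariate(2.0) for _ in range(200)]

theta_mle = len(data) / sum(data)      # closed-form MLE of the rate
theta_zero = h_estimate(data, 0.0)     # maximizer in the alpha -> 0 limit
theta_small = h_estimate(data, 0.01)   # maximizer for a small nonzero alpha

print(theta_mle, theta_zero, theta_small)
```

Up to the grid resolution, `theta_zero` should match `theta_mle`, and `theta_small` should lie close to both. A grid search is used instead of a local optimizer because the centrality may have several critical points for some values of $\alpha$.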

Abstract

In this report, we explore the data selection leading to a family of estimators that maximize a centrality. The family has several desirable properties that lead to accurate and robust probability density function fitting according to criteria we define. We establish a link between the centrality estimator and maximum likelihood, showing that the latter is a particular case. This yields a new probabilistic interpretation of Fisher's maximum likelihood. We introduce and study two specific centralities, which we name the Hölder and Lehmer estimators. A numerical simulation shows the effectiveness of the proposed families of estimators, opening the door to the development of new concepts and algorithms in machine learning, data mining, statistics, and data analysis.

Paper Structure (9 sections, 26 equations, 3 figures)

Figures (3)

  • Figure 1: First row: the Hölder centrality (H-C) calculated using the histogram (d), with its critical points and maxima. Second row: a histogram, the central observations explained in P7-P9, and H-Fisher at the maxima.
  • Figure 2: First row: the Lehmer centrality (L-C) calculated using the histogram in Figure \ref{Critic}(d), with its critical points and maxima. Second row: the critical points $\theta$ of the Hölder (solid) and Lehmer (dashed) centralities as a function of $\alpha$, drawn using the implicit function in Eq. \ref{ExpFirstCond}.
  • Figure 3: Fitting of 405 histograms of DC coefficients. a) The H-estimator $\theta$ for each image. b) The histogram of the $\alpha$ selected according to the residual error.