An effective estimation of multivariate density functions using extended-beta kernels with Bayesian adaptive bandwidths
Sobom M. Somé, Célestin C. Kokonendji, Francial G. B. Libengué Dobélé-Kpoka
TL;DR
This paper develops a unified multivariate density estimator based on the multiple extended-beta kernel (MEBK) on compact supports, defining $\widehat{f}_{n}(\boldsymbol{x})=\frac{1}{n}\sum_{i=1}^{n}\prod_{j=1}^{d} EB_{x_j,h_j,a_j,b_j}(X_{ij})$ with $EB_{x,h,a,b}$ the univariate extended-beta kernel. It establishes bias, variance, and asymptotic normality under suitable smoothness and bandwidth conditions, and introduces a Bayesian adaptive bandwidth selector using independent inverse-gamma priors $IG(\alpha,\beta_\ell)$, yielding Bayes estimators $\widetilde{\boldsymbol{h}}_i=\mathbb{E}(\boldsymbol{h}_i|\mathbf{X}_i)$ and practical choices like $\alpha=n^{2/5}$ and $\beta_\ell=1$, along with an automatic support estimator $\widehat{\mathbb{T}}_d$. Through extensive simulations (univariate and multivariate) and real-data applications (cholesterol, Old Faithful, and student marks), the MEBK with Bayesian bandwidths demonstrates competitive or superior smoothing performance measured by ISE and log-likelihood compared to Gaussian and gamma kernels, especially near boundaries. The work provides explicit theoretical results, practical bandwidth rules, and empirical evidence of flexibility and universality in density estimation across bounded and unbounded domains, with potential for software implementation and future methodological extensions such as combined MEBK variants.
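As a rough illustration of the product-kernel estimator $\widehat{f}_{n}$ above, the following Python sketch evaluates it at a single point. The summary does not spell out the closed form of $EB_{x,h,a,b}$, so the sketch assumes the usual extended-beta form: a beta density with shape parameters $1+(x-a)/\{(b-a)h\}$ and $1+(b-x)/\{(b-a)h\}$, rescaled to $[a,b]$. The function names `eb_kernel` and `mebk_density` are hypothetical, and the bandwidths are fixed per dimension rather than chosen by the Bayesian adaptive rule.

```python
import math


def eb_kernel(u, x, h, a=0.0, b=1.0):
    """Extended-beta kernel EB_{x,h,a,b}(u).

    ASSUMED form (not stated in the summary): a Beta(p, q) density
    rescaled to [a, b], with p = 1 + (x-a)/((b-a)h), q = 1 + (b-x)/((b-a)h).
    """
    if u < a or u > b:
        return 0.0  # the kernel vanishes outside the compact support
    p = 1.0 + (x - a) / ((b - a) * h)
    q = 1.0 + (b - x) / ((b - a) * h)
    t = (u - a) / (b - a)  # map u to the unit interval
    # Endpoints: the density is non-zero only when the matching shape equals 1,
    # in which case B(1, q) = 1/q and B(p, 1) = 1/p.
    if t == 0.0:
        return q / (b - a) if p == 1.0 else 0.0
    if t == 1.0:
        return p / (b - a) if q == 1.0 else 0.0
    # Work on the log scale to stay stable for small bandwidths.
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    log_val = (p - 1.0) * math.log(t) + (q - 1.0) * math.log(1.0 - t) - log_beta
    return math.exp(log_val) / (b - a)


def mebk_density(x, data, h, a, b):
    """Product-kernel estimate (1/n) * sum_i prod_j EB_{x_j,h_j,a_j,b_j}(X_ij).

    x, h, a, b are sequences of length d; data is a sequence of n rows of length d.
    """
    total = 0.0
    for row in data:
        prod = 1.0
        for j, x_ij in enumerate(row):
            prod *= eb_kernel(x_ij, x[j], h[j], a[j], b[j])
        total += prod
    return total / len(data)
```

For example, `mebk_density([0.5, 0.5], data, [0.1, 0.1], [0, 0], [1, 1])` evaluates the estimate at the centre of the unit square for a list of two-dimensional observations `data`. In the adaptive setting of the paper, the fixed `h` would be replaced by observation-specific bandwidths.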
Abstract
Multivariate kernel density estimation has attracted considerable interest. In addition to conventional methods based on (non-)classical associated kernels for (un)bounded densities and bandwidth selection, multiple extended-beta kernel (MEBK) estimators with Bayesian adaptive bandwidths are investigated to gain deeper insight into the estimation of multivariate density functions. The univariate extended-beta smoother is unimodal and has an adaptable compact support, which suits any dataset, since observed data are always bounded. The support of the MEBK density estimator can be known or estimated from extreme values. Asymptotic properties of the (non-)normalized estimators are established. Explicit and general choices of bandwidths using the flexible Bayesian adaptive method are provided. The behaviour of the estimator near the sensitive edges of its support is analysed and compared with that of Gaussian and gamma kernel estimators. Finally, simulation studies and three applications to original and commonly used real datasets show the advantages of the proposed method in terms of flexibility and universality.
