Title
“Bayesian Principal Curve Clustering by Normalized Generalized Gamma Mixture Models”
Abstract
In this work we are interested in clustering data whose support is “curved”. To this end, we follow a Bayesian nonparametric approach. First, we define a new class of random probability measures that approximates the well-known normalized generalized gamma (NGG) process. Our approximation relies on the representation of the NGG process as a discrete measure whose weights are obtained by normalizing the jumps of a Poisson process. In our approximation only unnormalized jumps larger than a threshold $\epsilon$ are retained; as a consequence, the number of jumps of this new prior, called the $\epsilon$-NGG process, is a.s. finite. We take the $\epsilon$-NGG process as the mixing measure in a mixture model for density and cluster estimation. Moreover, as the kernel of our mixture we consider a flexible class of distributions that can model data from clusters with nonstandard shapes. To build this class, we extend the definition of principal curve given in Tibshirani (1992) to a Bayesian framework. As an application, we consider the detection of seismic faults using data from Italian earthquake catalogues.
This is joint work with Alessandra Guglielmi from Politecnico di Milano.
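
The truncation idea in the abstract (keep only the unnormalized jumps larger than $\epsilon$, then normalize them) can be illustrated with a minimal Python sketch. It is not the construction used in the talk. It assumes the commonly used NGG Lévy intensity $\rho(s) = \frac{\kappa}{\Gamma(1-\sigma)} s^{-1-\sigma} e^{-\omega s}$ with $0 < \sigma < 1$, which the abstract does not state, and the function name and the parameters `kappa`, `sigma`, `omega` are hypothetical. Jumps above $\epsilon$ are generated by inverting the tail mass of the untempered intensity and are then thinned with probability $e^{-\omega s}$.

```python
import math
import random


def sample_truncated_ngg_weights(eps, sigma, kappa, omega, rng):
    """Illustrative sketch: jumps larger than eps and their normalized weights.

    Assumes the Levy intensity kappa / Gamma(1 - sigma) * s**(-1 - sigma) * exp(-omega * s);
    this parametrization is an assumption, not taken from the abstract.
    """
    if not (0.0 < sigma < 1.0):
        raise ValueError("this sketch needs 0 < sigma < 1")
    if eps <= 0.0:
        raise ValueError("eps must be positive")

    c = kappa / math.gamma(1.0 - sigma)
    # Mass of the untempered intensity c * s**(-1 - sigma) on (eps, infinity).
    mass_above_eps = c * eps ** (-sigma) / sigma

    jumps = []
    # Arrival times of a unit-rate Poisson process on (0, mass_above_eps).
    arrival = rng.expovariate(1.0)
    while arrival < mass_above_eps:
        # Invert the tail mass c * s**(-sigma) / sigma to get a jump size > eps.
        jump = (sigma * arrival / c) ** (-1.0 / sigma)
        # Thinning step: keep the jump with probability exp(-omega * jump).
        if rng.random() < math.exp(-omega * jump):
            jumps.append(jump)
        arrival += rng.expovariate(1.0)

    total = sum(jumps)
    # Normalize the retained jumps; the list is empty if no jump survived.
    weights = [j / total for j in jumps]
    return jumps, weights


if __name__ == "__main__":
    rng = random.Random(2024)
    jumps, weights = sample_truncated_ngg_weights(
        eps=0.01, sigma=0.3, kappa=1.0, omega=1.0, rng=rng
    )
    print(len(jumps), weights)
```

The number of retained jumps is finite because the intensity has finite mass on $(\epsilon, \infty)$. In this sketch the draw can come back empty, and a complete construction would need a rule for that case; the sketch deliberately leaves it open.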
