Variable Selection for Covariate Dependent Dirichlet Process Mixture of Regressions

Abstract

Dirichlet Process Mixture (DPM) models have been increasingly employed to specify random partition models that take into account possible patterns within the covariates. Furthermore, in response to large numbers of covariates, methods for selecting the most important covariates have been proposed. Commonly, the covariates are chosen either for their importance in determining the clustering of the observations or for their effect on the level of a response variable (in case a regression model is specified). Typically both strategies involve the specification of latent indicators that regulate the inclusion of the covariates in the model. Common examples involve the use of spike and slab prior distributions. In this work we review the most relevant DPM models that include the covariate information in the induced partition of the observations and we focus extensively on available variable selection techniques for these models. We highlight the main features of each model and demonstrate them in simulations.