Distributed inference in Bayesian nonparametric models using partially collapsed MCMC
Bayesian nonparametric models are an elegant way to discover underlying latent features within a data set, but inference in such models can be slow. Inferring latent components using Markov chain Monte Carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which is usually slow. We take advantage of the fact that the latent components are conditionally independent under the given stochastic process (we apply our technique to the Dirichlet process and the Indian buffet process). Because of this conditional independence, we can partition the latent components into two parts: one containing only the finitely many instantiated components and the other containing the infinite tail of uninstantiated components. For the finite part, parallel inference is straightforward given the instantiated components. For the infinite tail, however, uncollapsed MCMC mixes poorly, so we collapse out those components. The resulting hybrid sampler, while parallel, produces samples asymptotically from the true posterior.
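To make the finite/infinite split concrete, here is a minimal sketch of one Gibbs sweep for a one-dimensional Gaussian Dirichlet process mixture (the IBP case is analogous but not shown). This is an illustrative toy, not the authors' actual algorithm: all hyperparameters (`alpha`, `sigma2`, `mu0`, `tau2`) and the conjugate-normal setup are assumptions made for the example. Assignments to instantiated clusters use the uncollapsed likelihood under the sampled means, which makes those terms conditionally independent and hence parallelizable; the uninstantiated tail is handled in collapsed form, with the new-cluster mean integrated out.

```python
import numpy as np

def _normpdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def partially_collapsed_sweep(x, z, thetas, alpha=1.0,
                              sigma2=1.0, mu0=0.0, tau2=4.0, rng=None):
    """One Gibbs sweep over assignments z and cluster means thetas."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = z.copy()
    thetas = list(thetas)
    for i in range(len(x)):
        K = len(thetas)
        counts = np.bincount(np.delete(z, i), minlength=K)
        # Finite part: uncollapsed likelihood under the instantiated means.
        # Given thetas, these terms are independent across data points.
        w = counts * _normpdf(x[i], np.array(thetas), sigma2)
        # Infinite tail: the mean is integrated out (collapsed), leaving
        # the marginal likelihood N(x_i | mu0, sigma2 + tau2).
        w_new = alpha * _normpdf(x[i], mu0, sigma2 + tau2)
        probs = np.append(w, w_new)
        probs /= probs.sum()
        k = rng.choice(K + 1, p=probs)
        if k == K:  # instantiate a new component from its posterior
            v = 1.0 / (1.0 / tau2 + 1.0 / sigma2)
            m = v * (mu0 / tau2 + x[i] / sigma2)
            thetas.append(rng.normal(m, np.sqrt(v)))
        z[i] = k
    # Conjugate update of each instantiated mean; these draws are
    # conditionally independent given z, hence also parallelizable.
    for k in range(len(thetas)):
        members = x[z == k]
        v = 1.0 / (1.0 / tau2 + len(members) / sigma2)
        m = v * (mu0 / tau2 + members.sum() / sigma2)
        thetas[k] = rng.normal(m, np.sqrt(v))
    return z, np.array(thetas)

# Toy usage: two well-separated groups, starting from a single cluster.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-4, 1, 20), rng.normal(4, 1, 20)])
z = np.zeros(len(x), dtype=int)
thetas = np.array([0.0])
for _ in range(20):
    z, thetas = partially_collapsed_sweep(x, z, thetas, rng=rng)
```

In a distributed implementation, the per-point assignment weights for the finite part and the per-cluster mean updates would be computed across workers; only the collapsed tail term requires global bookkeeping.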