Probabilistic neural networks




Ott, Evan Austin

Journal Title

Journal ISSN

Volume Title



Neural networks are flexible models capable of capturing complicated data relationships. However, neural networks are typically trained to make maximum likelihood predictions that ignore uncertainty in the model parameters. Additionally, stochasticity is often not incorporated into predictions. Inspired by Bayesian methodology, this work explores ways of incorporating uncertainty in neural network–based models, whether approximating a Bayesian posterior, formulating an alternative to a Bayesian posterior, or developing generative models inspired by parametric Bayesian models.

First, we explore the impact of different approximations in approximate Bayesian inference by considering probabilistic backpropagation (Hernández-Lobato and Adams, 2015), an approximate method for Bayesian neural networks that uses several Gaussian approximations in an assumed density filtering (Opper, 1999) setting. We explore an alternative approximation using a spike-and-slab distribution, designed to be a more accurate approximation to the true distribution.

Second, we explore the use of alternative notions of a posterior distribution. Nonparametric learning (Lyddon et al., 2019; Fong et al., 2019) is a method that provides principled uncertainty estimates about parameters of interest, while making minimal assumptions about the parameters. However, a naïve approach will scale poorly to large models such as normalizing flows. We show that an approximate implementation, where some parameters are fixed across all samples from the posterior, allows us  to achieve improved predictive performance without incurring excessive computational costs.

Finally, building on results on edge-exchangeable graphs (Crane and Dempsey, 2016; Cai et al., 2016) and generative graph neural networks (e.g., Li et al. (2018)), we propose an edge-based generative graph neural network model. By construction, our model should be able to more easily learn and replicate the structure of sparse graphs, which are common in real-world settings. In this ongoing work, we find that the resulting distributions over graphs are able to capture realistic graph properties in a variety of settings.



LCSH Subject Headings