Gradient estimation for discrete variables via dependent Monte Carlo samples




Dimitriev, Aleksandar



Discrete expectations arise in many machine learning tasks, and we often need to backpropagate gradients through them. One domain is variational inference, where training discrete latent variable models requires gradient estimates with respect to a high-dimensional discrete distribution, because we backpropagate through a discrete stochastic layer in a deep neural network. Another important area of research is permutation- or ranking-based objectives, where the objective itself is discrete and non-differentiable. To tackle these problems, we propose ARMS, an antithetic REINFORCE-based Monte Carlo gradient estimator for three discrete distributions: binary, categorical, and Plackett-Luce, where each of the last two generalizes the one before it. ARMS uses negatively correlated samples produced by a copula for variance reduction, and leverages importance sampling to produce an unbiased estimate. Our approach also generalizes several other estimators: ARMS with two samples reduces to the recently developed DisARM estimator for binary and categorical distributions, and ARMS with independent samples reduces to the strong self-control baseline LOORF/VarGrad. We evaluate ARMS on several objectives and datasets. We show that ARMS outperforms the state of the art on training variational autoencoders with binary or categorical latent variables, trained using either the evidence lower bound or the multi-sample bound. We also evaluate our approach on a structured prediction task for training stochastic categorical networks. Lastly, we evaluate ARMS on several Plackett-Luce-based objectives, which include permutation and ranking losses, with similar results. The code is released as open source.
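
As a point of reference for the independent-sample case mentioned in the abstract, the following is a minimal sketch of a leave-one-out REINFORCE (LOORF) gradient estimator for a factorized Bernoulli distribution parameterized by logits. It is not the ARMS estimator itself: it draws independent samples and uses no copula or importance weights. The function names (`sigmoid`, `loorf_gradient`) and the toy objective `f` are assumptions made for illustration and do not come from the thesis or its released code.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def loorf_gradient(f, logits, n_samples, rng):
    """Leave-one-out REINFORCE estimate of the gradient, with respect to
    `logits`, of E[f(b)] where b ~ Bernoulli(sigmoid(logits)) elementwise.

    Illustrative sketch with independent samples (hypothetical helper).
    """
    p = sigmoid(logits)
    # Draw n independent binary samples, shape (n_samples, D).
    b = (rng.random((n_samples, logits.shape[0])) < p).astype(float)
    # Evaluate the objective once per sample, shape (n_samples,).
    fb = np.array([f(row) for row in b])
    # Score function: d/d logits of log Bernoulli(b; sigmoid(logits)) = b - p.
    score = b - p
    # Centering by the sample mean and dividing by (n - 1) is algebraically
    # the same as using the mean of the other n - 1 samples as the baseline
    # for each sample and then averaging over n.
    centered = fb - fb.mean()
    return (centered[:, None] * score).sum(axis=0) / (n_samples - 1)


# Toy usage: a separable quadratic objective (an arbitrary choice).
rng = np.random.default_rng(0)
logits = np.array([0.3, -1.0, 2.0])
f = lambda b: np.sum((b - 0.45) ** 2)
grad_estimate = loorf_gradient(f, logits, n_samples=4, rng=rng)
```

Because each sample's baseline is built only from the other samples, the baseline is independent of that sample and the estimate stays unbiased. The abstract describes ARMS as replacing the independent draws with negatively correlated ones to reduce variance further.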

