Gradient estimation for discrete variables via dependent Monte Carlo samples

dc.contributor.advisor: Zhou, Mingyuan (Assistant professor)
dc.contributor.committeeMember: Gao, Rui
dc.contributor.committeeMember: Sager, Thomas
dc.contributor.committeeMember: Williamson, Sinead
dc.creator: Dimitriev, Aleksandar
dc.creator.orcid: 0000-0001-6442-349X
dc.date.accessioned: 2022-08-29T22:34:39Z
dc.date.available: 2022-08-29T22:34:39Z
dc.date.created: 2022-05
dc.date.issued: 2022-05-06
dc.date.submitted: May 2022
dc.date.updated: 2022-08-29T22:34:40Z
dc.description.abstract: Discrete expectations arise in many machine learning tasks, and we often need to backpropagate gradients through them. One domain is variational inference, where training discrete latent variable models requires gradient estimates for a high-dimensional discrete distribution, because we backpropagate through a discrete stochastic layer in a deep neural network. Another important area of research is permutation- or ranking-based objectives, which are themselves discrete and non-differentiable. To tackle these problems, we propose ARMS, an antithetic REINFORCE-based Monte Carlo gradient estimator for three discrete distributions: binary, categorical, and Plackett-Luce, where each of the last two generalizes the preceding one. ARMS uses negatively correlated samples produced by a copula for variance reduction, and leverages importance sampling to produce an unbiased estimate. Our approach also generalizes several other estimators: ARMS with two samples reduces to the recently developed DisARM estimator for binary and categorical distributions, and ARMS with independent samples reduces to the strong self-control baseline LOORF/VarGrad. We evaluate ARMS on several objectives and datasets. We show that ARMS outperforms the state of the art on training variational autoencoders with binary or categorical latent variables, trained using either the evidence lower bound or the multi-sample bound. We also compare our approach on a structured prediction task for training stochastic categorical networks. Lastly, we evaluate ARMS on different Plackett-Luce-based objectives, including permutation and ranking losses, with similar results, and we release the code publicly.
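As a concrete illustration of the dependent-sample idea in the abstract, the sketch below implements the two-sample binary special case that ARMS reduces to, i.e. the DisARM estimator: a pair of antithetic uniforms (u, 1 - u) yields negatively correlated Bernoulli samples, which are combined into an unbiased REINFORCE-style gradient estimate for the logit. This is a minimal sketch for the binary case only; it is not the full ARMS construction (no general copula, no categorical or Plackett-Luce distribution, no importance weights beyond the two-sample formula), and the function name `disarm_grad` is illustrative.

```python
import numpy as np

def disarm_grad(f, logit, n_pairs=200_000, seed=0):
    """Estimate d/dlogit E_{b ~ Bernoulli(sigmoid(logit))}[f(b)]
    with antithetically coupled sample pairs (two-sample case)."""
    rng = np.random.default_rng(seed)
    p = 1.0 / (1.0 + np.exp(-logit))          # sigmoid(logit)
    u = rng.random(n_pairs)
    b = (u < p).astype(float)                  # Bernoulli sample
    bt = (u > 1.0 - p).astype(float)           # antithetic sample from 1 - u
    # Two-sample estimator:
    #   (f(b) - f(bt))/2 * (-1)^bt * 1[b != bt] * sigmoid(|logit|)
    weight = 1.0 / (1.0 + np.exp(-np.abs(logit)))
    g = 0.5 * (f(b) - f(bt)) * ((-1.0) ** bt) * (b != bt) * weight
    return g.mean()
```

For a quick sanity check, taking f(b) = b gives the analytic gradient p(1 - p), which the estimator recovers; negatively correlating the pair makes the two samples disagree more often than independent draws would, which is the source of the variance reduction.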
dc.description.department: Information, Risk, and Operations Management (IROM)
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2152/115426
dc.identifier.uri: http://dx.doi.org/10.26153/tsw/42325
dc.language.iso: en
dc.subject: Machine learning
dc.subject: Statistics
dc.subject: Monte Carlo
dc.subject: Gradient estimation
dc.subject: Discrete variable
dc.subject: Copula
dc.title: Gradient estimation for discrete variables via dependent Monte Carlo samples
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Information, Risk, and Operations Management (IROM)
thesis.degree.discipline: Computer Science
thesis.degree.grantor: The University of Texas at Austin
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Access full-text files

Original bundle:
- DIMITRIEV-DISSERTATION-2022.pdf (3.81 MB, Adobe Portable Document Format)

License bundle:
- PROQUEST_LICENSE.txt (4.45 KB, Plain Text)
- LICENSE.txt (1.84 KB, Plain Text)