Unsupervised learning for large-scale data

dc.contributor.advisor: Sanghavi, Sujay Rajendra, 1979-
dc.contributor.advisor: Dimakis, Alexandros G.
dc.contributor.committeeMember: Caramanis, Constantine
dc.contributor.committeeMember: Klivans, Adam R.
dc.contributor.committeeMember: Ward, Rachel A.
dc.creator: Wu, Shanshan, Ph. D.
dc.date.accessioned: 2021-05-07T20:04:16Z
dc.date.available: 2021-05-07T20:04:16Z
dc.date.created: 2019-12
dc.date.issued: 2019-09-20
dc.date.submitted: December 2019
dc.date.updated: 2021-05-07T20:04:17Z
dc.description.abstract: Unsupervised learning involves inferring the inherent structures or patterns from unlabeled data. Since there is no label information, the fundamental challenge of unsupervised learning is that the objective function is not explicitly defined. The ubiquity of large-scale datasets adds another layer of complexity: when the data size or dimension is large, even algorithms with quadratic runtime may be prohibitively expensive. This thesis presents four large-scale unsupervised learning problems. We start with two density estimation problems: given samples from a one-layer ReLU generative model or a discrete pairwise graphical model, the goal is to recover the parameters of the generative model. We then move to representation learning of high-dimensional sparse data coming from one-hot encoded categorical features. We assume that there are additional but a priori unknown structures in their support; the goal is to learn a lossless low-dimensional embedding for the given data. Our last problem is to compute low-rank approximations of a matrix product given the individual matrices. We are interested in the setting where the matrices are too large to fit in memory and can only be stored on disk. For every problem presented in this thesis, we (i) design novel and efficient algorithms to capture the inherent structure of the data in an unsupervised manner; (ii) establish theoretical guarantees and compare the empirical performance with state-of-the-art methods; and (iii) provide source code to support our experimental findings.
dc.description.department: Electrical and Computer Engineering
dc.format.mimetype: application/pdf
dc.identifier.uri: https://hdl.handle.net/2152/85596
dc.identifier.uri: http://dx.doi.org/10.26153/tsw/12547
dc.language.iso: en
dc.subject: Unsupervised learning
dc.subject: Density estimation
dc.subject: ReLU neural networks
dc.subject: Generative model
dc.subject: Graphical model
dc.subject: Ising model
dc.subject: Structural learning
dc.subject: Representation learning
dc.subject: Sparse data
dc.subject: Compressed sensing
dc.subject: Low-rank approximation
dc.subject: Dimensionality reduction
dc.title: Unsupervised learning for large-scale data
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Electrical and Computer Engineering
thesis.degree.discipline: Electrical and Computer Engineering
thesis.degree.grantor: The University of Texas at Austin
thesis.degree.level: Doctoral
thesis.degree.name: Doctor of Philosophy

Original bundle

Name: WU-DISSERTATION-2019.pdf
Size: 2.13 MB
Format: Adobe Portable Document Format

License bundle

Name: LICENSE.txt
Size: 1.84 KB
Format: Plain Text
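The abstract's last problem, computing a low-rank approximation of a matrix product A·B without ever materializing the full product, can be illustrated with a generic randomized range-finding sketch. This is a minimal, hypothetical illustration of the general technique under the stated setting (A and B accessed only through matrix products), not the algorithm developed in the thesis; all function and variable names below are illustrative.

```python
import numpy as np

def low_rank_product_approx(A, B, k, oversample=10, seed=0):
    """Return factors (L, R) with L @ R a rank-k approximation of A @ B.

    A @ B is never formed explicitly: A and B are touched only through
    matrix products with thin matrices, which suits the setting where
    the factors are too large to fit in memory.
    """
    rng = np.random.default_rng(seed)
    n = B.shape[1]
    # Random Gaussian test matrix, slightly oversampled for stability.
    G = rng.standard_normal((n, k + oversample))
    # Sample the column space of A @ B via two thin products.
    Y = A @ (B @ G)
    # Orthonormal basis for the sampled range.
    Q, _ = np.linalg.qr(Y)
    # Small core matrix Q^T (A B), again via two thin products.
    C = (Q.T @ A) @ B
    # Truncate the small core to exactly rank k with an SVD.
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    L = (Q @ U[:, :k]) * s[:k]
    R = Vt[:k]
    return L, R
```

When A·B is exactly low rank, the sketch recovers it almost surely; in general the quality depends on the spectral decay of the product, as in standard randomized low-rank approximation.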