Efficient non-convex algorithms for large-scale learning problems
The emergence of modern large-scale datasets has generated great interest in the problem of learning hidden complex structures. Models built on such structures not only fit the datasets but also generalize well in the regime where the number of samples is limited compared to the dimensionality. However, one of the main issues is finding computationally efficient algorithms to learn these models. While convex relaxation provides polynomial-time algorithms with strong theoretical guarantees, the sheer volume of practical datasets creates demand for even faster algorithms with competitive performance. In this dissertation, we consider three types of algorithms, namely greedy methods, alternating minimization, and non-convex gradient descent, which have been key non-convex approaches to large-scale learning problems. For each type, we focus on a specific problem and design an algorithm based on the corresponding approach. We begin with the problem of subspace clustering, where one needs to learn an underlying union of subspaces from a set of data points lying near the subspaces. We develop two greedy algorithms that can perfectly cluster the points and recover the subspaces. The next problem of interest is collaborative ranking, where underlying low-rank preference matrices are to be learned from pairwise comparisons of their entries. We present an algorithm based on alternating minimization. Finally, we develop a non-convex gradient descent algorithm for general low-rank matrix optimization problems. All of these algorithms exhibit low computational complexity as well as competitive statistical performance, which makes them scalable and suitable for a variety of practical applications. Analysis of the algorithms provides theoretical guarantees on their performance.
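To illustrate what non-convex gradient descent for a low-rank matrix problem can look like, the following is a minimal generic sketch and not the algorithm developed in the dissertation. It assumes the simplest possible objective, the squared Frobenius error to a fully observed matrix, and parameterizes the low-rank variable as a product of two thin factors. The function names, the small random initialization, and the step-size heuristic are all assumptions made for this example.

```python
import numpy as np


def factored_gradient_descent(M, rank, iters=2000, init_scale=1e-2, seed=0):
    """Approximately minimize 0.5 * ||U @ V.T - M||_F^2 over U (m x rank), V (n x rank).

    The objective is non-convex in (U, V) jointly, even though it is
    convex in the product U @ V.T.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    # Small random initialization (an assumption; other schemes exist).
    U = init_scale * rng.standard_normal((m, rank))
    V = init_scale * rng.standard_normal((n, rank))
    # Conservative step size scaled by the spectral norm of M (a heuristic choice).
    step = 0.25 / np.linalg.norm(M, 2)
    for _ in range(iters):
        R = U @ V.T - M      # residual of the current low-rank estimate
        grad_U = R @ V       # gradient of the objective with respect to U
        grad_V = R.T @ U     # gradient of the objective with respect to V
        U, V = U - step * grad_U, V - step * grad_V
    return U, V


def relative_error(U, V, M):
    """Frobenius-norm error of U @ V.T relative to the norm of M."""
    return np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```

Each iteration costs only a few products with thin matrices, with no eigendecomposition or projection onto a convex set, which is the kind of per-iteration saving that motivates factored approaches at large scale.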