# Browsing by Subject "Non-convex optimization"

Now showing 1 - 4 of 4


## Efficient non-convex algorithms for large-scale learning problems (2016-12)

Park, Dohyung; Sanghavi, Sujay Rajendra; Caramanis, Constantine; Ghosh, Joydeep; Dimakis, Alexandros G.; Price, Eric

The emergence of modern large-scale datasets has led to great interest in the problem of learning hidden complex structures. Models built from such structures not only fit the datasets, they also generalize well in the regime where the number of samples is limited compared to the dimensionality. A central issue, however, is finding computationally efficient algorithms to learn the models. While convex relaxation provides polynomial-time algorithms with strong theoretical guarantees, the sheer volume of practical datasets creates demand for even faster algorithms with competitive performance. In this dissertation, we consider three types of algorithms that have been key non-convex approaches to large-scale learning problems: greedy methods, alternating minimization, and non-convex gradient descent. For each theme, we focus on a specific problem and design an algorithm around that theme's core ideas. We begin with subspace clustering, where one must learn an underlying union of subspaces from a set of data points lying near those subspaces. We develop two greedy algorithms that can perfectly cluster the points and recover the subspaces. The next problem of interest is collaborative ranking, where underlying low-rank preference matrices are to be learned from pairwise comparisons of their entries. We present an algorithm based on alternating minimization. Finally, we develop a non-convex gradient descent algorithm for general low-rank matrix optimization problems. All of these algorithms exhibit low computational complexity as well as competitive statistical performance, making them scalable and suitable for a variety of practical applications.
Analysis of the algorithms provides theoretical guarantees of their performance.

## Large scale matrix factorization with guarantees: sampling and bi-linearity (2015-12)

Bhojanapalli, Venkata Sesha Pavana Srinadh; Sanghavi, Sujay Rajendra; Caramanis, Constantine; Dhillon, Inderjit; Dimakis, Alexandros; Ravikumar, Pradeep; Ward, Rachel

Low-rank matrix factorization is an important step in many high-dimensional machine learning algorithms. Traditional factorization algorithms do not scale well to growing data sizes, creating a need for faster, more scalable alternatives. In this dissertation we explore the following two major themes to design scalable factorization algorithms for three problems: matrix completion, low-rank approximation (PCA), and semi-definite optimization.

- Sampling: We develop the optimal way to sample entries of any matrix while preserving its spectral properties. Using this sparse sketch (the set of sampled entries) instead of the entire matrix gives rise to scalable algorithms with good approximation guarantees.
- Bi-linear factorization structure: We design algorithms that operate explicitly on the factor space instead of on the matrix. While the bi-linear structure of the factorization leads, in general, to a non-convex optimization problem, we show that under appropriate conditions these algorithms indeed recover the solution for the above problems.

Both techniques (individually or in combination) lead to algorithms with lower computational complexity and memory usage.
Finally, we extend these ideas of sampling and explicit factorization to design algorithms for higher-order tensors.

## Learning with latent structures, robustness and non-linearity: non-convex approaches (2016-12)

Yi, Xinyang; Caramanis, Constantine; Dimakis, Georgios-Alex; Price, Eric; Sanghavi, Sujay; Ward, Rachel

Algorithms based on non-convex optimization are ubiquitous in machine learning and statistical estimation, especially for complex models that are noisy, non-linear, or contain latent structures. Popular techniques for non-convex problems, such as alternating minimization and gradient descent, are typically easy to derive and use but mostly lack theoretical foundations. Obtaining statistical consistency guarantees for these non-convex heuristics is difficult because of the local minima present in non-convex problems. From an algorithmic perspective, it is even unclear whether average-case instances of some learning problems, which are typically NP-hard in the worst case, can be solved in polynomial time. This thesis seeks to tackle these challenges for several statistical learning problems: mixed linear regression, robust principal component analysis, and single index models, in both standard and high-dimensional settings. We provide novel non-convex approaches to these problems by developing a theoretical understanding of several techniques favored in practice, including alternating minimization, EM, and non-convex gradient descent. We also present (optimal) statistical consistency guarantees and scalable computational complexity for the proposed algorithms.
Our results suggest that, in the problems we study, these non-convex heuristics converge quickly to the global optimum (or to statistically comparable solutions) when initialized within certain local regions around the true underlying parameters.

## Provable alternating minimization for non-convex learning problems (2014-08)

Netrapalli, Praneeth Kumar; Sanghavi, Sujay Rajendra

Alternating minimization (AltMin) is a generic term for a widely popular approach in non-convex learning: often, the variables can be partitioned into two (or more) sets such that the problem is convex (or tractable) in one set when the other is held fixed, and vice versa. This allows for alternating between optimally updating one set of variables and then the other. AltMin methods typically carry no global consistency guarantees, even though they are empirically observed to outperform methods (e.g., those based on convex optimization) that do have guarantees. In this thesis, we obtain rigorous performance guarantees for AltMin in three statistical learning settings: low-rank matrix completion, phase retrieval, and learning sparsely-used dictionaries. The overarching theme behind our results consists of two parts: (i) devising new initialization procedures (as opposed to initializing randomly, as is typical), and (ii) establishing exponential local convergence from this initialization. Our work shows that the pursuit of statistical guarantees can yield algorithmic improvements (initialization, in our case) that also perform better in practice.
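Two of the abstracts above mention non-convex gradient descent for low-rank matrix problems: rather than optimizing over the full matrix, one descends directly on the factors of M ≈ UVᵀ. The following is a minimal illustrative sketch of that idea on a fully observed matrix, assuming a small random initialization and a fixed step size; the dissertations pair such updates with careful initialization and step-size analysis, so this is not any thesis's actual algorithm.

```python
import numpy as np

def factored_gradient_descent(M, rank, step=0.01, iters=5000, seed=0):
    """Minimize ||M - U V^T||_F^2 by gradient descent on the factors U, V.

    Illustrative sketch only: small random initialization, fixed step size.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = 0.1 * rng.standard_normal((m, rank))
    V = 0.1 * rng.standard_normal((n, rank))
    for _ in range(iters):
        R = U @ V.T - M      # residual of the current factorization
        U, V = U - step * (R @ V), V - step * (R.T @ U)
    return U, V
```

Note that the objective is non-convex in (U, V) jointly, which is exactly why guarantees for this kind of update require the local-region and initialization arguments described in the abstracts.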
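The second abstract describes sampling entries of a matrix to form a sparse, spectrum-preserving sketch. The sketch below uses a generic squared-magnitude sampling distribution with rescaling for unbiasedness; it illustrates the entry-sampling idea only and is not the optimal distribution that dissertation develops.

```python
import numpy as np

def entrywise_sample(M, num_samples, seed=0):
    """Sparse sketch of M: draw entries with probability proportional to
    M_ij^2 and rescale so the sketch is an unbiased estimator of M.

    Generic squared-magnitude sampling for illustration only.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    p = (M ** 2).ravel()
    p = p / p.sum()
    # sample entry indices with replacement, then count draws per entry
    idx = rng.choice(m * n, size=num_samples, p=p)
    counts = np.bincount(idx, minlength=m * n).reshape(m, n)
    with np.errstate(divide="ignore", invalid="ignore"):
        S = counts * M / (num_samples * p.reshape(m, n))
    S[counts == 0] = 0.0  # unsampled entries stay zero
    return S
```

Since E[counts_ij] = num_samples · p_ij, each nonzero entry of the sketch averages to M_ij in expectation, and the sketch becomes more accurate as the number of samples grows.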
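The last abstract defines the AltMin template concretely: fix one set of variables, solve the now-tractable problem in the other, and alternate. For low-rank matrix completion, each sub-step reduces to a small least-squares problem per row or column over the observed entries. The sketch below follows that template with a spectral initialization, echoing the thesis's emphasis on non-random initialization, though the details of the actual analyzed algorithm differ.

```python
import numpy as np

def altmin_complete(M_obs, mask, rank, iters=25):
    """Alternating minimization for matrix completion (illustrative sketch).

    M_obs: observed matrix with unobserved entries zero-filled.
    mask:  same shape, nonzero where an entry is observed.
    With V fixed, fitting U is least squares row by row, and vice versa.
    """
    m, n = M_obs.shape
    # spectral initialization: top singular vectors of the zero-filled matrix
    U0, _, _ = np.linalg.svd(M_obs, full_matrices=False)
    U = U0[:, :rank].copy()
    V = np.zeros((n, rank))
    for _ in range(iters):
        for j in range(n):  # update each row of V over its observed entries
            rows = mask[:, j].astype(bool)
            if rows.any():
                V[j], *_ = np.linalg.lstsq(U[rows], M_obs[rows, j], rcond=None)
        for i in range(m):  # update each row of U over its observed entries
            cols = mask[i].astype(bool)
            if cols.any():
                U[i], *_ = np.linalg.lstsq(V[cols], M_obs[i, cols], rcond=None)
    return U, V
```

Each inner solve is convex, which is the point of the AltMin partition; the global guarantees discussed in the abstract come from combining such updates with a good starting point and a local convergence argument.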