Large scale matrix factorization with guarantees: sampling and bi-linearity
Low-rank matrix factorization is an important step in many high-dimensional machine learning algorithms. Traditional factorization algorithms do not scale well with growing data sizes, so faster, more scalable algorithms are needed. In this dissertation we explore two major themes for designing scalable factorization algorithms for three problems: matrix completion, low-rank approximation (PCA), and semi-definite optimization. (a) Sampling: We develop the optimal way to sample entries of any matrix while preserving its spectral properties. Using this sparse sketch (the set of sampled entries) instead of the entire matrix gives rise to scalable algorithms with good approximation guarantees. (b) Bi-linear factorization structure: We design algorithms that operate explicitly on the factor space instead of on the matrix. While the bi-linear structure of the factorization leads, in general, to a non-convex optimization problem, we show that under appropriate conditions these algorithms indeed recover the solution to the above problems. Both techniques, individually or in combination, lead to algorithms with lower computational complexity and memory usage. Finally, we extend these ideas of sampling and explicit factorization to design algorithms for higher-order tensors.
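As a rough illustration of how the two themes can combine, the sketch below completes a small rank-1 matrix from a subset of its entries by alternating least squares on the factors. It is not the dissertation's algorithm. It assumes uniform sampling rather than the optimal distribution developed in the thesis, it always keeps the first row and first column so that the toy problem has a unique solution, and the names `sample_entries` and `rank1_als` are hypothetical.

```python
import random


def sample_entries(n_rows, n_cols, prob, rng):
    """Return a set of sampled (i, j) index pairs.

    Assumption for this toy example: entries are kept uniformly at random
    with probability `prob`, and row 0 and column 0 are always kept so that
    every row and column is observed at least once.
    """
    omega = set()
    for i in range(n_rows):
        for j in range(n_cols):
            if i == 0 or j == 0 or rng.random() < prob:
                omega.add((i, j))
    return omega


def rank1_als(observed, n_rows, n_cols, iters=300):
    """Fit M ~ u v^T using only the observed entries.

    `observed` maps (i, j) -> value. The full matrix is never formed;
    only the factors u and v are updated (the bi-linear structure).
    """
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(iters):
        # Fix v: each u[i] is a one-dimensional least-squares solution.
        num = [0.0] * n_rows
        den = [0.0] * n_rows
        for (i, j), m in observed.items():
            num[i] += m * v[j]
            den[i] += v[j] * v[j]
        u = [num[i] / den[i] for i in range(n_rows)]

        # Fix u: each v[j] is a one-dimensional least-squares solution.
        num = [0.0] * n_cols
        den = [0.0] * n_cols
        for (i, j), m in observed.items():
            num[j] += m * u[i]
            den[j] += u[i] * u[i]
        v = [num[j] / den[j] for j in range(n_cols)]
    return u, v


# Ground-truth rank-1 matrix M = a b^T with positive entries.
a = [1.0, 2.0, 1.5, 1.2, 1.8, 1.1]
b = [1.0, 1.5, 2.0, 1.3, 1.7]
M = [[ai * bj for bj in b] for ai in a]

rng = random.Random(0)
omega = sample_entries(len(a), len(b), 0.5, rng)
observed = {(i, j): M[i][j] for (i, j) in omega}

u, v = rank1_als(observed, len(a), len(b))

# Largest error over ALL entries, including the ones never sampled.
max_err = max(
    abs(u[i] * v[j] - M[i][j]) for i in range(len(a)) for j in range(len(b))
)
print(f"sampled {len(omega)} of {len(a) * len(b)} entries, max error {max_err:.2e}")
```

Each update touches only the sampled entries and the two factor vectors, so cost and memory scale with the number of samples rather than with the size of the matrix. The joint problem in `u` and `v` is non-convex, but each half-step is an ordinary least-squares problem.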