Dirty statistical models


Title: Dirty statistical models
Author: Jalali, Ali, 1982-
Abstract: In fields across science and engineering, we increasingly face problems in which the number of variables or features to be estimated is much larger than the number of observations. Under such high-dimensional scaling, any hope of statistically consistent estimation depends on leveraging structure in the problem, such as sparsity, low-rank structure, or block sparsity. However, data may deviate significantly from any one such structural model. The question motivating this thesis is: can we simultaneously leverage more than one such structural model to obtain consistency in a larger number of problems, and with fewer samples, than is possible with any single model? Our approach combines structural models via simple linear superposition, a technique we term dirty models. The idea is simple: while any one structure might not capture the data, a superposition of structural classes might. Dirty models thus search for a parameter that can be decomposed into a number of simpler structures, such as (a) sparse plus block-sparse, (b) sparse plus low-rank, and (c) low-rank plus block-sparse. In this thesis, we propose dirty-model-based algorithms for problems including multi-task learning, graph clustering, and time-series analysis with latent factors, and we analyze these algorithms in terms of the number of observations needed to estimate the variables. These algorithms are based on convex optimization and can be relatively slow. We therefore provide a class of low-complexity greedy algorithms that not only solve these optimization problems faster but also come with guarantees on the solution. In addition to theoretical results, we provide experimental results in each case to illustrate the power of dirty models.
Department: Electrical and Computer Engineering
Subject: Structure learning; Statistical inference; Dirty models; High-dimensional statistics; Machine learning; Sparse and low-rank decomposition; Graph clustering; Time series analysis; Greedy dirty algorithms
URI: http://hdl.handle.net/2152/ETD-UT-2012-05-5088
Date: 2012-05
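
Illustrative note: the "sparse plus low-rank" superposition mentioned in the abstract can be sketched with a small convex program. The code below is a minimal sketch under stated assumptions, not the thesis's algorithms: it assumes a fully observed matrix M, a squared-error fit, and a standard alternating proximal scheme (singular value thresholding for the low-rank part, entrywise soft thresholding for the sparse part). The function names, the penalty weights lam_L and lam_S, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise shrinkage: proximal operator of tau * (l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def singular_value_threshold(X, tau):
    """Shrink singular values: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(M, L, S, lam_L, lam_S):
    """0.5 * ||M - L - S||_F^2 + lam_L * ||L||_* + lam_S * ||S||_1."""
    fit = 0.5 * np.linalg.norm(M - L - S, "fro") ** 2
    return fit + lam_L * np.linalg.norm(L, "nuc") + lam_S * np.abs(S).sum()

def sparse_plus_low_rank(M, lam_L, lam_S, n_iter=50):
    """Alternate exact minimization over L (low-rank) and S (sparse)."""
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    history = []
    for _ in range(n_iter):
        L = singular_value_threshold(M - S, lam_L)  # update low-rank part
        S = soft_threshold(M - L, lam_S)            # update sparse part
        history.append(objective(M, L, S, lam_L, lam_S))
    return L, S, history

# Synthetic example (assumed data): rank-2 matrix plus 20 large sparse entries.
rng = np.random.default_rng(0)
n, r = 30, 2
L_true = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
S_true = np.zeros((n, n))
idx = rng.choice(n * n, size=20, replace=False)
S_true.flat[idx] = 10.0 * rng.choice([-1.0, 1.0], size=20)
M = L_true + S_true

L_hat, S_hat, history = sparse_plus_low_rank(M, lam_L=1.0, lam_S=0.3)
print("objective, first vs. last iteration:", history[0], history[-1])
```

Each update exactly minimizes the convex objective over one block with the other held fixed, so the objective value cannot increase from one iteration to the next. How well L_hat and S_hat match the planted components depends on the penalty weights, which are untuned here.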

Files in this work

Size: 10.51 MB
Format: application/pdf
