Regularization in econometrics and finance
This dissertation develops regularization methods for problems in finance and econometrics. The key methodology introduced is utility-based selection (UBS), a procedure for inducing sparsity in statistical models and in practical problems that call for simple, parsimonious decisions.
The introduction discusses statistical model selection in light of the "big data hype" and the desire to fit rich and complex models. Key emphasis is placed on the fundamental bias-variance tradeoff in statistics. The remainder of the introduction ties these notions to the components and procedure of UBS, framing model selection as a decision problem and developing the procedure from decision-theoretic principles.
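For reference, the bias-variance tradeoff mentioned above is commonly stated through the following decomposition of expected squared prediction error. The notation is an assumption made here for illustration and is not taken from the dissertation: data follow y = f(x) + e with mean-zero noise of variance \sigma^2, \hat{f} is a fitted model, and (x_0, y_0) is a new observation whose noise is independent of the training sample.

```latex
\mathbb{E}\big[(y_0 - \hat{f}(x_0))^2\big]
  = \sigma^2
  + \big(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\big)^2
  + \operatorname{Var}\big(\hat{f}(x_0)\big)
```

The first term is irreducible noise, the second is squared bias, and the third is variance. Regularization typically accepts some increase in the second term in exchange for a reduction in the third.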
The second chapter applies UBS to portfolio optimization. A dynamic portfolio construction framework is presented in which asset returns are modeled using a Bayesian dynamic linear model. The focus is on constructing simple, or sparse, portfolios of passive funds. The empirical analysis considers a set of the most liquid exchange-traded funds.
The third chapter discusses variable selection in seemingly unrelated regression (SUR) models. UBS is applied in the setting where an analyst wants to find, among p available predictors, which subset is most relevant for describing variation in q different responses. The selection procedure takes into account uncertainty in both the responses and the predictors. It is applied to a popular problem in asset pricing: discovering which factors (predictors) are relevant for pricing the cross section of asset returns (responses). We also discuss future work in monotonic function estimation and how UBS may be applied in that context.
The fourth chapter considers regularization in treatment effect estimation using linear regression. It introduces "regularization-induced confounding" (RIC), a pitfall of employing naive regularization techniques to estimate a treatment effect from observational data. A new model parameterization is presented that mitigates RIC. Additionally, we discuss recent work on uncertainty characterization when model errors may vary across clusters of data. These developments employ empirical Bayes and bootstrapping techniques.
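The RIC pitfall described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not drawn from the dissertation: the data-generating design, the coefficient values, the ridge penalty, and the function names treatment_estimates and average_estimates are all hypothetical. It compares ordinary least squares with a naive ridge regression that shrinks the confounder coefficients while leaving the treatment coefficient unpenalized. It does not implement the reparameterization proposed in the chapter.

```python
import numpy as np


def treatment_estimates(n=500, p=30, alpha=1.0, penalty=500.0, seed=0):
    """Return (ols_alpha, ridge_alpha) for one simulated data set.

    Hypothetical design: the treatment z depends on the confounders X,
    and the response y depends on both z and X.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    gamma = np.full(p, 0.3)  # assumed effect of confounders on treatment
    beta = np.full(p, 0.3)   # assumed effect of confounders on response
    z = X @ gamma + rng.standard_normal(n)
    y = alpha * z + X @ beta + rng.standard_normal(n)

    # Design matrix with the treatment in the first column.
    D = np.column_stack([z, X])

    # Unregularized least squares.
    ols = np.linalg.lstsq(D, y, rcond=None)[0]

    # Naive ridge: penalize the confounder coefficients only.
    P = penalty * np.eye(p + 1)
    P[0, 0] = 0.0  # the treatment coefficient is left unpenalized
    ridge = np.linalg.solve(D.T @ D + P, D.T @ y)

    return ols[0], ridge[0]


def average_estimates(reps=20, **kwargs):
    """Average the two treatment-effect estimates over several seeds."""
    draws = np.array([treatment_estimates(seed=s, **kwargs) for s in range(reps)])
    return draws.mean(axis=0)


if __name__ == "__main__":
    ols_mean, ridge_mean = average_estimates()
    print(f"true alpha = 1.0, OLS mean = {ols_mean:.3f}, ridge mean = {ridge_mean:.3f}")
```

Under this assumed design, the ridge estimate of the treatment effect tends to be biased away from the true value: shrinking the confounder coefficients shifts part of their explanatory power onto the correlated treatment variable.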