Browsing by Subject "Causal inference"
Now showing 1 - 10 of 10
Item Bargaining over nature: formal and causal analyses on environments and conflict (2019-05)
Kikuta, Kyosuke; Findley, Michael G., 1976-; Busby, Joshua W.; Jessee, Stephan; Wolford, Michael Scott
Despite the growing attention to environmental changes and their consequences for conflict, we still do not know the roles of human responses and strategic interactions. This dissertation comprises three essays that address this gap. The central argument is that natural environments do not only affect conflict directly; their effects are also mediated by human responses, political institutions, and strategic opportunities. In each essay, I elaborate this argument using formal models, causal inference methods, and geospatial data. The analyses indicate that natural environments do not automatically cause or inhibit conflict; rather, human actions can critically shape the relationship.

Item Bayesian approaches for inference after selection and model fitting (2020-09-11)
Woody, Spencer Arlen; Scott, James (Statistician); Murray, Jared S.; Carvalho, Carlos M.; Zigler, Corwin M.; Hoff, Peter
This thesis presents a set of methods unified around the theme of providing valid inference when data are used to answer multiple questions of interest. The first portion takes on the case where the data are used twice: first to select targets of inference, and then a second time to form estimates for those targets. The proposed method uses a Bayesian formulation to give more efficient (shorter) confidence intervals that properly account for selection and thereby retain nominal frequentist coverage. The second portion, comprising the bulk of this thesis, formalizes the approach of posterior summarization, unifying a set of ideas originating in the early 2000s. Posterior summarization is the process by which a model is fit to the relevant underlying outcome and then interpreted through post hoc exploration via lower-dimensional functionals.
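The posterior summarization workflow described above can be caricatured in a few lines: fit a flexible model once, then interpret each posterior draw through a lower-dimensional summary functional (here, a best-fitting linear slope). This is an illustrative toy with simulated draws and hypothetical names, not code from the thesis:

```python
# Toy sketch of posterior summarization: interpret posterior draws of a
# flexible fit through a simple lower-dimensional functional (a slope).
import random
import statistics

random.seed(0)

xs = [i / 10 for i in range(11)]          # covariate grid

def flexible_fit(x):
    """Stand-in for a fitted nonparametric regression (posterior mean)."""
    return 2.0 * x + 0.3 * x * x

# Pretend these are posterior draws of the fitted curve on the grid.
draws = [[flexible_fit(x) + random.gauss(0, 0.05) for x in xs]
         for _ in range(200)]

def slope_summary(fitted):
    """Project one posterior draw of the curve onto its best linear slope."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(fitted)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, fitted))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

slopes = sorted(slope_summary(d) for d in draws)
lo, hi = slopes[4], slopes[-5]            # roughly a 95% posterior interval
print(round(statistics.fmean(slopes), 2), (lo, hi))
```

The key point matching the abstract: the data are used only once, to produce the posterior draws; the summary (the slope and its interval) is a post hoc functional of those draws, not a second model fit to the data.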
The data are used only once, to fit the model in the first stage. This approach is applied to interpret predictive trends within nonparametric regression models, to select important confounders and perform model-specification sensitivity analyses in linear models for causal effect estimation, and to detect the presence of heterogeneous treatment effects in observational studies. These methods are applied to several real and simulated datasets.

Item Bayesian methods for complex data structures, with applications to precision medicine in women's healthcare (2020-05)
Starling, Jennifer Elizabeth; Scott, James (Statistician); Murray, Jared S.; Carvalho, Carlos M.; Aiken, Abigail R. A.
This thesis explores novel Bayesian nonparametric regression techniques for data with complex structures, developed in response to challenges in women's health and obstetrics. Nearly all pregnancy-related research shares a key statistical issue: most outcomes vary smoothly with gestational age. Models which reflect this smoothness aid interpretability by aligning model choices with clinical knowledge; from a statistical perspective, smoothing can reduce variance without inflating bias. Existing models tend to smooth over all covariates, or to require specification of parametric forms and interactions based on a priori knowledge of maternal and fetal covariates, and the current literature does not provide an especially nuanced characterization of these functional forms. Chapter 1 frames these issues in the context of current statistical modeling practices in women's health and obstetrics. Chapter 2 introduces a model for estimating patient-specific stillbirth risk over the course of gestation, with the aim of helping obstetricians prevent fetal mortality. In this chapter, we introduce BART with Targeted Smoothing (tsBART), a nonparametric regression model which extends the Bayesian Additive Regression Trees (BART) prior to introduce smoothness over a single target covariate t.
tsBART extends BART by parameterizing each tree's terminal nodes with smooth functions of t rather than independent scalars. Both BART and tsBART capture complex nonlinear relationships and interactions among the predictors, but tsBART guarantees that the response surface is smooth in the target covariate. This improves interpretability and helps regularize the estimate. After introducing and benchmarking the tsBART model, we apply it to pregnancy outcomes data from the National Center for Health Statistics. Our aim is to provide patient-specific estimates of stillbirth risk across gestational age (t), based on maternal and fetal risk factors (x). The results of our analysis show the clear superiority of the tsBART model for quantifying stillbirth risk, thereby providing patients and doctors with better information for managing the risk of fetal mortality. Chapter 3 extends these ideas into the causal inference setting to analyze a new clinical protocol for early medical abortion. We introduce Targeted Smooth Bayesian Causal Forests (tsBCF), a nonparametric Bayesian approach for estimating heterogeneous treatment effects which vary smoothly over a single covariate in the observational data setting. The tsBCF method also induces smoothness by parameterizing terminal tree nodes with smooth functions, and it allows for separate regularization of treatment effects versus the prognostic effects of control covariates. Smoothing parameters for prognostic and treatment effects can be chosen to reflect prior knowledge or tuned in a data-dependent way. Our aim is to assess the relative effectiveness of simultaneous versus interval administration of mifepristone and misoprostol over the first nine weeks of gestation. The model reflects our expectation that the relative effectiveness varies smoothly over gestation, but not necessarily over other covariates. We demonstrate the performance of the tsBCF method in benchmarking experiments.
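The "smooth terminal nodes" idea behind tsBART and tsBCF can be caricatured with a single tree split whose leaves carry linear functions of the target covariate t instead of scalar means, so predictions vary smoothly in t within each leaf while remaining piecewise in the other covariates. A toy sketch (illustrative only, not the actual BART machinery, and with hypothetical data):

```python
# One-split "tree" whose leaves are linear in t rather than constant,
# mimicking the targeted-smoothing idea on a tiny synthetic dataset.
def fit_leaf(points):
    """Least-squares line a + b*t for the (t, y) pairs routed to a leaf."""
    n = len(points)
    mt = sum(t for t, _ in points) / n
    my = sum(y for _, y in points) / n
    num = sum((t - mt) * (y - my) for t, y in points)
    den = sum((t - mt) ** 2 for t, _ in points) or 1.0  # guard degenerate leaf
    b = num / den
    return (my - b * mt, b)

def fit_stump(data, split_x):
    """data: list of (x, t, y); one split on x, smooth leaves in t."""
    left  = [(t, y) for x, t, y in data if x <  split_x]
    right = [(t, y) for x, t, y in data if x >= split_x]
    return fit_leaf(left), fit_leaf(right)

def predict(stump, split_x, x, t):
    (a_l, b_l), (a_r, b_r) = stump
    a, b = (a_l, b_l) if x < split_x else (a_r, b_r)
    return a + b * t   # smooth in t, piecewise in x

# Synthetic truth: y = t when x < 0.5, y = 2t when x >= 0.5.
data = [(x, t / 4, (1 if x < 0.5 else 2) * t / 4)
        for x in (0.0, 1.0) for t in range(5)]
stump = fit_stump(data, 0.5)
print(predict(stump, 0.5, 0.0, 0.5), predict(stump, 0.5, 1.0, 0.5))
```

A constant-leaf tree would predict a step function that is flat in t; the smooth-leaf version recovers a response surface that varies continuously in the target covariate, which is the property the abstract emphasizes.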
In Chapter 4, we aim to characterize the relationship between birth weight and maternal pre-eclampsia across gestation at a large maternity hospital in urban Uganda. Key scientific questions we investigate include: 1) how pre-eclampsia compares to other maternal-fetal covariates as a predictor of birth weight; and 2) whether the impact of pre-eclampsia on birth weight varies across gestation. We propose a nonparametric regression model called Projective Smooth BART (psBART), which addresses several key statistical challenges. First, our model correctly encodes the prior medical knowledge that birth weight should vary smoothly and monotonically with gestational age. It also avoids assumptions about functional forms and about how birth weight varies with other covariates. Finally, psBART accounts for the fact that a high proportion (83%) of birth weights in our dataset are rounded to the nearest 100 grams. Such extreme data coarsening is rare in maternity hospitals in high-resource settings but common for datasets collected in low- and middle-income countries (LMICs); it introduces a substantial extra layer of uncertainty into the problem and is a major reason why we adopt a Bayesian approach. The results of our analysis show that pre-eclampsia is a dominant predictor of birth weight in this urban Ugandan setting and is therefore an important risk factor for perinatal mortality. Chapter 5 summarizes our contributions and describes directions for future research.

Item Causal inference for investigating Parkinson's disease pathogenesis (2023-12)
Zhai, Jingpeng; Bajaj, Chandrajit
Randomized controlled trials have long been regarded as the standard method for establishing causal relationships. However, in situations where it is impractical to carry out such trials, observational studies involving natural and random variations, combined with causal inference methods, can be used to reason about causality.
Causal inference methods require the expression of expert domain knowledge in the form of a causal model. But what happens in situations where there is little to no prior knowledge? In a dataset with a plethora of variables, how should one identify and isolate potential treatment and outcome variables? For example, Parkinson's disease (PD) is a disorder with diverse manifestations and multiple proposed molecular pathways but no established etiology. Given a dataset of PD patients and healthy controls, with clinical data spanning varying levels of biology, how does one approach causal graph construction? In this work, we devise a scheme that uses gradient-boosted tree ensemble algorithms to systematically identify important features for use in causal graph construction, and we attempt to establish causal relationships between them based on biological hierarchy. Lastly, we find one genotype feature of α-synuclein to have a significant causal effect on PD diagnosis.

Item Essays on Causal Inference with Endogeneity and Missing Data (2017-05)
Feng, Qian, Ph.D.; Donald, Stephen G.; Abrevaya, Jason; Xu, Haiqing; Carvalho, Carlos M.
This dissertation strives to devise novel yet easy-to-implement estimation and inference procedures for economists to solve complicated real-world problems. It provides efficient solutions for situations in which sample selection is entangled with missing data problems and in which treatment effects are heterogeneous but instruments have only limited variation. In the first chapter, we investigate the problem of missing instruments and develop a generated instrument approach to address it. When the missingness of instruments is endogenous, dropping observations can cause biased estimation. This chapter proposes a methodology which uses all the data to perform instrumental variables (IV) estimation, providing consistent estimation under endogenous missingness of instruments.
The methodology first forms a generated instrument for every observation in the sample: a) for observations without instruments, the new instrument is an imputation; b) for observations with instruments, the new instrument is an inverse propensity score weighted combination of the original instrument and an imputation. Estimation then proceeds using the generated instruments. Asymptotic theorems are established: the new estimator attains the semiparametric efficiency bound, and in simulations it is less biased than existing procedures. As an illustrative example, we use the NLSYM dataset, in which IQ scores are partially missing, and demonstrate that with the new methodology the return to education is larger and more precisely estimated than under standard complete-case methods. In the second chapter, we provide Lasso-type procedures for reduced-form regression with many missing instruments. The methodology takes two steps. In the first step, we generate a rich instrument set from the many missing instruments and other observed data. In the second step, IV estimation is conducted based on the generated instrument set. Specifically, the (very) many generated instruments are used to approximate a "pseudo" optimal instrument in the reduced-form regression. This approach is shown to have efficiency gains over the generated instrument estimator developed in the first chapter. We also compare the finite-sample behavior of the new estimator with other Lasso estimators and demonstrate the good performance of the proposed estimator in Monte Carlo experiments. The third chapter estimates individual treatment effects in a triangular model with binary-valued endogenous treatments. This chapter is based on previous joint work with Quang Vuong and Haiqing Xu. Following the identification strategy established in Vuong and Xu (forthcoming), we propose a two-stage estimation approach.
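The generated-instrument construction described in the first chapter above might look roughly like this. The weighting formula below is a plausible stand-in chosen for illustration, and all names are hypothetical; it is not the chapter's actual estimator:

```python
# Illustrative generated instrument: imputation for missing instruments,
# inverse-propensity-weighted combination where the instrument is observed.
def generated_instrument(z, observed, z_hat, p_obs):
    """
    z        : instrument value (ignored when observed is False)
    observed : whether the instrument is non-missing
    z_hat    : imputed instrument from other covariates (hypothetical model)
    p_obs    : estimated propensity that the instrument is observed
    """
    if not observed:
        return z_hat                      # a) pure imputation
    return z_hat + (z - z_hat) / p_obs    # b) IPW-weighted combination

rows = [
    # (z, observed, z_hat, p_obs)
    (1.2,  True,  1.0, 0.8),
    (None, False, 0.9, 0.4),
    (0.5,  True,  0.6, 0.5),
]
z_gen = [generated_instrument(*r) for r in rows]
print(z_gen)
```

The point of the weighting is that every observation contributes a usable instrument, so IV estimation can proceed on the full sample rather than on complete cases only.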
First, we estimate the counterfactual outcome, and hence the individual treatment effect (ITE), for every observational unit in the sample. Second, we estimate the density of individual treatment effects in the population. Our estimation method does not suffer from the ill-posed inverse problem associated with inverting a nonlinear functional. Asymptotic properties of the proposed method are established, and we study its finite-sample properties in Monte Carlo experiments. We also illustrate our approach with an empirical application assessing the effects of 401(k) retirement programs on personal savings. Our results show that there exists a small but statistically significant proportion of individuals who experience negative effects, although the majority of ITEs are positive.

Item Graph theoretic results on index coding, causal inference and learning graphical models (2016-08)
Shanmugam, Karthikeyan; Dimakis, Alexandros G.; Sanghavi, Sujay; Shakkottai, Sanjay; Caramanis, Constantine; Zuckerman, David
Exploiting and learning graph structures is becoming ubiquitous in network information theory and machine learning. The former deals with efficient communication schemes in many-node networks; in the latter, inferring graph-structured relationships from high-dimensional data is important. In this dissertation, some graph theoretic results in these two areas are presented. The first part deals with the problem of optimizing bandwidth resources for a shared broadcast link serving many users, each having access to cached content. This problem and its variations are broadly called Index Coding. Index Coding is fundamental to understanding multi-terminal network problems and has applications in networks that deploy caches. The second part deals with the resources required for learning a network structure that encodes distributional and causal relationships among many variables in machine learning.
The number of samples needed to learn graphical models that capture crucial distributional information is studied. For learning causal relationships, when passive data acquisition is not sufficient, the number of interventions required is investigated. In the first part, efficient algorithms for placing popular content in a network that deploys a distributed system of caches are provided. Then the Index Coding problem is considered: every user has its own given cache content, and transmissions on a shared link are to be optimized. All graph theoretic schemes for Index Coding known prior to this work are shown to perform within a constant factor of the one based on graph coloring. Then 'partial' flow-cut gap results for information flow in a multi-terminal network are obtained by leveraging Index Coding ideas; this provides a poly-logarithmic approximation for a known generalization of multicut. Finally, optimal cache design in Index Coding for an adversarial demand pattern is considered, and near-optimal algorithms for cache design and delivery within a broad class of schemes are presented. In the second part, sample complexity lower bounds under average error for learning random Ising graphical models, sampled from Erdős–Rényi ensembles, are obtained. Then the number of bounded interventions required to learn a network of causal relationships under Pearl's model is studied, and upper and lower bounds on the number of size-bounded interventions required for various classes of graphs are obtained.

Item Interpretable random structure for non-standard data with applications to biomedical and social science studies (2022-05-02)
Liu, Zhengqing; Müller, Peter, 1963 August 9-; Sarkar, Abhra; Somer-Topcu, Zeynep; Walker, Stephen G.; Zigler, Corwin M.
In this dissertation, we introduce examples of non-standard data that occur in statistical inference for biomedical and social science research.
We develop novel statistical methodology that aims to make interpretable and practically meaningful inferences from these datasets. The first project proposes a hypothesis testing procedure applied to phylogenetic trees that represent evolutionary paths of seasonal influenza strains. The method quantifies the change in the genetic composition of seasonal influenza over years and serves as a crucial step in informing vaccine selection. The second project uses information from atmospheric studies that describe the movement and dispersion of pollutants emitted from coal-powered factories. The goal is to make valid causal statements about how interventions applied at such factories may affect the health outcomes of surrounding communities, an important step toward policy recommendations. The third project proposes a joint model to analyze tweet contents posted by UK general election candidates. The proposed method effectively borrows information from external sources, such as concurrent newspapers. The joint model describes the types of issues candidates tend to focus on, given their demographic information and the election constituency they represent.

Item Methodological problems in causal inference, with reference to transitional justice (2014-08)
Lee, Byung-Jae; Luskin, Robert C.
This dissertation addresses methodological problems in causal inference in the presence of time-varying confounding and provides methodological tools to handle these problems within the potential outcomes framework. Time-varying confounding is common in longitudinal observational studies, in which covariates and treatments interact and change over time in response to intermediate outcomes and changing circumstances. Existing approaches to causal inference are mostly focused on static, single-shot decision-making settings and have limitations in estimating the effects of long-term treatments on chronic problems.
In this dissertation, I conceptualize causal inference in this situation as a sequential decision problem, using conceptual tools developed in decision theory, dynamic treatment regimes, and machine learning. I also provide methodological tools useful for this situation, especially when treatments are multi-level and change over time, using inverse probability weights and g-estimation. Substantively, this dissertation examines transitional justice's effects on human rights and democracy in emerging democracies. Using transitional justice as an example to illustrate the proposed methods, I conceptualize the adoption of transitional justice by a new government as a sequential decision-making process, and I empirically examine the comparative effectiveness of transitional justice measures, independently or in combination with others, on human rights and democracy.

Item Scalable and causal Bayesian inference (2021-08-30)
Chavez, Omar Demian; Williamson, Sinead; Daniels, Michael J.; Linero, Antonio; Shively, Tom
This thesis focuses on two facets of Bayesian estimation. First, we propose methods that improve parameter estimation in particle filtering in a distributed computing environment by allowing periodic communication between compute nodes. The periodic communication can improve on the embarrassingly parallel version of our particle filter without dramatically increasing computational costs. Our method is intended for use on data with large N, or in streaming settings where latent parameters change over time.
Second, we propose a method for estimating heterogeneous treatment effects in observational studies using transformed response variables, via a modification of Bayesian additive regression trees that incorporates a mixture model in the regression error terms.

Item Spatial applications of Markov random fields and neural networks for spatio-temporal denoising, causal inference and reinforcement learning (2022-08-16)
García Tec, Mauricio Benjamín; Scott, James (Statistician); Zigler, Corwin Matthew, 1983-; Zhou, Mingyuan; Walker, Stephen G.; Stone, Peter H.
Discrete spatial structures are ubiquitous in statistical analysis. They can take the form of images, grids, and, more generally, graphs. This work develops novel methodology leading to broadly applicable algorithms for graph smoothing and neural networks to improve statistical learning in a variety of tasks and spatially structured domains, including temporal and sequential decision-making processes. Each chapter corresponds to a case study with applications in spatio-temporal denoising, causal inference, or reinforcement learning. Graph smoothing methods are used in all of them, and their effectiveness is evaluated. In addition, some chapters develop more specialized methods that further exploit the spatial and statistical structure of the data. One objective sustained throughout the work is developing scalable algorithms to handle high-resolution spatial data and other computationally demanding scenarios.