Quantitative and modeling aspects of optimal decision making under uncertainty






This dissertation studies decision making under uncertainty, and more precisely the quantitative and modeling aspects of "how to acquire and, in turn, exploit information optimally for decision-making in stochastic environments". To address the challenges posed by different types of uncertainty, a range of methods has been developed in the fields of stochastic control under partial information, dynamic information acquisition, data-driven optimization, model uncertainty, and robust optimization. This dissertation is composed of two parts.
The first part focuses on an offline data-driven decision-making problem with side information. With abundant data routinely collected in many industries to support decision-making, historical data with rich side information (temporal, spatial, social, or economic) are available prior to decision making and reveal partial information about the randomness of the problem. The challenge for these high-dimensional problems is that the empirical distribution constructed from the observed data is not representative of the underlying true distribution linking contextual information and decisions, and strategies based solely on the empirical data can perform poorly when implemented. Therefore, a fundamental problem in data-driven decision-making under uncertainty, as well as in statistical learning, is finding solutions that perform well not only on the observed data but also on new, previously unseen data. To hedge against the distributional uncertainty of the offline dataset, this dissertation provides an end-to-end learning framework, based on distributionally robust stochastic optimization (DRSO), that prescribes a non-parametric policy with certified robustness, provable optimality, and efficient implementation.
Specifically, we study policy optimization for a series of feature-based decision-making problems, seeking an end-to-end policy that gives an explicit mapping from features to decisions. We first consider a Wasserstein robust optimization framework, where our contribution is finding an optimal robust policy without restricting to a parametric family while still maintaining computational efficiency and interpretability. More specifically, by exploiting the structure of the optimal policy, we identify a new class of policies that are provably robust optimal and can be computed by linear programming. We apply this work to the newsvendor problem. Furthermore, we propose a new uncertainty set based on causal transport distance, which contains distributions that share a similar conditional information structure with the nominal distribution. We derive a tractable dual reformulation for evaluating the worst-case expected cost and show that the worst-case distribution has a conditional information structure similar to that of the nominal distribution. We identify tractable cases for finding the optimal decision rules over an affine class or the entire non-parametric class, and apply our work to conditional regression, incumbent pricing, and portfolio selection.
The second part is concerned with dynamic information acquisition with sequential decision-making and differential information sources. When dynamic learning is used to facilitate decision making, decision makers often have imperfect and costly information, so they face a trade-off between learning and the expected payoff given limited information. For example, when comparing new technologies, a firm often spends considerable funds and time on research and development (R&D) to identify the best technology to adopt. Other examples include investors designing algorithms to learn about the returns of different assets and scientists conducting research to investigate the validity of different hypotheses. From the viewpoint of dynamic information acquisition, the practically important features are the choice of "what to learn", as well as "when to learn and stop learning". Most of the decision-making problems considered in this line of work are static (i.e., a single, irreversible decision), which over-simplifies the structure of many real-world applications that require dynamic or sequential decisions. Moreover, the information source in these studies typically remains constant (e.g., a single noisy signal) throughout the decision process, failing to capture the adaptive nature of decision makers in response to stochastically changing environments. Herein, we introduce a general framework that allows for both sequential (possibly reversible) decisions and dynamically changing information sources (distinct signals), and that also includes the cost of acquiring information over time. We analyze a benchmark example motivated by the return/exchange policies of e-commerce platforms. Specifically, we introduce a sequential decision-making problem that allows decision makers to reverse their initial decisions, with their costly information acquisition setting changing accordingly. We investigate the optimal strategies for information acquisition and decision reversal, and carry out a complete sensitivity and asymptotic analysis of how decision makers can adapt their learning behavior to achieve the best decision-making outcomes.
In what follows, we describe each approach separately. For each part, we introduce the corresponding model, construct solutions, and provide a detailed analytical methodology.
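As a purely illustrative aside on the newsvendor application mentioned in the first part, the sketch below computes the plain empirical (sample-average) order quantity, that is, the kind of strategy based solely on the empirical data that a robust framework is meant to improve on. It is not the dissertation's method and ignores side information entirely; the function names, the per-unit costs `underage` and `overage`, and the demand samples are all hypothetical.

```python
def empirical_newsvendor_order(demands, underage, overage):
    """Return an order quantity minimizing the average newsvendor cost
    over the observed demand samples (hypothetical helper).

    A minimizer is the smallest sample whose empirical CDF value is at
    least the critical ratio underage / (underage + overage).
    """
    if not demands:
        raise ValueError("need at least one demand sample")
    if underage <= 0 or overage <= 0:
        raise ValueError("costs must be positive")
    ordered = sorted(demands)
    n = len(ordered)
    # Smallest 1-based rank k with k / n >= underage / (underage + overage),
    # written without division to avoid rounding issues for integer costs.
    k = next(k for k in range(1, n + 1)
             if k * (underage + overage) >= underage * n)
    return ordered[k - 1]


def average_cost(order, demands, underage, overage):
    """Average cost of a fixed order quantity over the demand samples."""
    total = sum(underage * max(d - order, 0) + overage * max(order - d, 0)
                for d in demands)
    return total / len(demands)


# Hypothetical data: eight past demand observations.
demands = [12, 7, 9, 15, 10, 8, 11, 14]
order = empirical_newsvendor_order(demands, underage=3, overage=1)
print(order, average_cost(order, demands, underage=3, overage=1))
```

With few samples, this empirical order can be far from the order that is optimal under the true demand distribution, which is the gap the distributionally robust formulation is intended to guard against.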


