# Browsing by Subject "Mixture model"

Now showing 1 - 4 of 4

## Bayesian estimation of finite mixture roughness model (2016-12)

Serigos, Pedro A. (Pedro Antonio); Prozzi, Jorge Alberto; Zhang, Zhanmin; Gilbert, Robert B; Müller, Peter; Mikhail, Magdy

Highway infrastructure systems provide a crucial service to society and constitute a major asset with a significant maintenance and rehabilitation cost, with highway pavements comprising a major component of the total cost. The increasing need for greater capital investment, in the face of ever-decreasing federal funding to maintain highway infrastructure, highlights the importance of developing and implementing effective methods for managing pavement assets. A key to the success of pavement management is accurately predicting the future condition of the pavements in the network. This dissertation proposes a mixture of regression models to capture the systematic differences in pavement performance not explained by variables typically available in pavement management systems. This approach assumes that the heterogeneous pavement performance, which results from the combined effect of several unobserved factors and interactions, is manifested through a finite number of latent groups. Estimating the proposed model defines the parameters of the group-specific models while clustering the observations into the latent groups. The insights provided by the model-based clustering of performance data can also be incorporated into the design of maintenance and rehabilitation strategies, as clustering sections according to their deterioration rate allows for identifying pavements in the network with structural deficiencies and tailoring actions in response. The gain in model fit, along with the insights provided by the proposed methodology for the unsupervised model-based clustering of pavement performance, was demonstrated using experimental data.
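As a point of reference, a finite mixture of regressions of the kind described above is commonly written in the form below. This is the generic textbook form with Gaussian components, which is an assumption made here for illustration; the abstract does not give the dissertation's exact specification. The symbols $K$ (number of latent groups), $\pi_k$ (group weights), $\beta_k$ (group-specific coefficients), $\sigma_k^2$ (group-specific variances), and $z_i$ (latent group label) are notation introduced for this sketch.

```latex
p(y_i \mid x_i) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(y_i \mid x_i^{\top}\beta_k,\ \sigma_k^2\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1,
\\[4pt]
\Pr(z_i = k \mid x_i, y_i) \;\propto\; \pi_k \, \mathcal{N}\!\left(y_i \mid x_i^{\top}\beta_k,\ \sigma_k^2\right).
```

The second line is the posterior group membership, which is what makes the fitted model usable for clustering observations into latent groups.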
In addition, the proposed mixture model was applied to develop a Bayesian pavement roughness model specified with variables from an existing pavement management system, plus climatic and preventive maintenance variables, and estimated using nationwide field data from the Long-Term Pavement Performance program. Lastly, the developed roughness mixture model was calibrated for Texas pavement conditions by combining the nationwide data with data extracted by processing and merging various Texas Department of Transportation databases. The proposed methodology produces accurate predictions of the progression of roughness as well as robust estimates of the factor effects driving the deterioration of pavements, which ultimately lead to more efficient management of highway assets.

## Pipe fractional flow theory : principles and applications (2014-01-14)

Nagoo, Anand Subhash; Sharma, Mukul M.; Bonnecaze, Roger T; Edgar, Thomas F; Rochelle, Gary T; Lake, Larry W

The contribution of this research is a simple, analytical mathematical modeling framework that connects multiphase pipe flow phenomena and satisfactorily reproduces key multiphase pipe flow experimental findings and field observations, from older classic data to modern data. The proposed unified formulation presents, for the first time, a reliably accurate analytical solution for averaged (1D) multiphase pipe flow over a wide range of applications. The two new fundamental insights provided by this research are that (a) macroscopic single-phase pipe flow fluid mechanics concepts can be generalized to multiphase pipe flow, and (b) viewing and analyzing multiphase pipe flow in general terms of averaged relative flow (or fractional flow) can lead to a unified understanding of its resultant (global) behavior.
The first insight stems from our finding that the universal relationship that exists between pressure and velocity in single-phase flow can also be found equivalently between pressure and relative velocity in multiphase flow. This eliminates the need for a-priori flow pattern determination in calculating multiphase flow pressure gradients. The second insight signifies that, in general, averaged multiphase flow problems can be sufficiently modeled by knowing only the averaged volume fractions. This proves that flow patterns are merely the visual, spatial manifestations of the in-situ velocity and volume fraction distributions (the quantities that govern the transport processes of the flow), which are neatly captured in the averaged sense as different fractional flow paths in our proposed fractional flow graphs. Due to their simplicity, these new insights provide for a deeper understanding of flow phenomena and a broader capability to produce quantitative answers in response to what-if questions. Since these insights do not draw from any precedent in the prior literature, a science-oriented, comprehensive validation of our core analytical principles was performed. Model validation was performed against a diverse range of vapor-liquid, liquid-liquid, fluid-solid and vapor-liquid-liquid applications (over 74,000 experimental measurements from over 110 different labs and over 6,000 field measurements). Additionally, our analytical theory was benchmarked against other modeling methods and current industry codes with identical (unbiased), named published data. The validation and benchmarking results affirm the central finding of this research – that simple, suitably-averaged analytical models can yield an improved understanding and significantly better accuracy than that obtained with extremely complex, tunable models. 
It is proven that the numerous, continuously interacting (local) flow microphysics effects in a multiphase flow can be (implicitly) accounted for by just a few properly validated (global) closure models that capture their net (resultant) behavior. In essence, it is the claim of this research that there is an underlying simplicity and connectedness in this subject when one looks at the resultant macroscopic (averaged) behaviors of the flow. The observed coherencies of the macroscopic, self-organizing physical structures that define the subject are equivalently present in the macroscopic mathematical descriptions of these systems, i.e., the flow-pattern-implicit, averaged-equations mixture models that describe the collective behavior of the flowing mixture.

## Simple, efficient and robust approaches for large scale learning (2020-06-22)

Shen, Yanyao; Sanghavi, Sujay Rajendra, 1979-; Huang, Qixing; Dimakis, Georgios-Alex; Caramanis, Constantine; Durrett, Gregory C

Robustness of a model plays a vital role in large scale machine learning. Classical estimators in robust statistics do not provide satisfactory computational efficiency as data size and model size scale. We draw ideas from robust statistics and focus on providing simple and efficient algorithmic paradigms for large scale learning that are provably robust to corrupted training samples. We start from standard supervised and unsupervised problems, and then move towards several semi-supervised settings, including mixed linear regression as well as multi-instance multi-label learning. We analyze the algorithms under regular statistical settings with mild assumptions, thus providing theoretical support for applying the ideas to large scale learning models such as deep neural networks. These simple algorithms serve as strong baselines and have achieved state-of-the-art results on certain tasks.
The algorithmic paradigm is applicable to a wide range of problems, and our theoretical insights may also guide future research on robust large scale learning.

## Theoretical analysis for convex and non-convex clustering algorithms (2018-05)

Yan, Bowei; Sarkar, Purnamrita; Caramanis, Constantine; Mueller, Peter; Walker, Stephen

Clustering is one of the most important unsupervised learning problems in the machine learning and statistics community. Given a set of observations, the goal is to find the latent cluster assignment of the data points. The observations can be either covariates corresponding to each data point or relational networks representing the affinity between pairs of nodes. We study the problems of community detection in stochastic block models and clustering in mixture models. The two kinds of problems bear a lot of resemblance, and similar techniques can be applied to solve them. It is common practice to assume some underlying model for the data generating process in order to analyze it properly. Given some pre-defined partition of all data points, generative models can be defined to represent those two types of data observations. For the covariates, the mixture model is one of the most flexible and widely used models, where each cluster $i$ comes from some distribution $D_i$, and the entire distribution is a convex sum over all distributions [mathematical equation]. We assume that the data is Gaussian or sub-Gaussian, and analyze two algorithms: 1) the Expectation-Maximization algorithm, which is notoriously non-convex and sensitive to local optima, and 2) a convex relaxation of the k-means algorithm. We show that both methods are consistent under certain conditions when the signal-to-noise ratio is relatively high. We also obtain upper bounds on the error rate when the signal-to-noise ratio is low.
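To make the Expectation-Maximization step mentioned above concrete, here is a minimal sketch for a two-component, one-dimensional Gaussian mixture. It is a generic textbook version, not the procedure analyzed in the thesis; the function name `em_gmm_1d`, the min/max initialisation, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def em_gmm_1d(x, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch only)."""
    # Assumed initialisation: means at the data extremes, shared variance, equal weights.
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.full(2, x.var())
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] proportional to pi_k * N(x_n | mu_k, var_k).
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilise before exponentiating
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted updates of weights, means and variances.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-9  # small floor
    # Hard cluster assignment: the component with the largest responsibility.
    return pi, mu, var, r.argmax(axis=1)

# Synthetic, well-separated data (a high signal-to-noise setting).
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-4.0, 1.0, 300), rng.normal(4.0, 1.0, 300)])
pi, mu, var, labels = em_gmm_1d(x)
```

The mixing weights `pi` are non-negative and sum to one, which is the "convex sum" of component distributions in the text. With poorly separated components or a bad initialisation, the same iteration can settle in a local optimum, which is the sensitivity the abstract refers to.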
When there are outliers in the data set, we show that the semi-definite relaxation gives more robust results than spectral methods. For the networks, we consider the Stochastic Block Model (SBM), in which the probability of edge presence is fully determined by the cluster assignments of the pair of nodes. We use a semi-definite programming (SDP) relaxation to learn the clustering matrix, and discuss the role of model parameters. In most SDP relaxations of SBM, the number of communities must be supplied to the algorithm, which is a strong requirement for many real-world applications. In this thesis, we propose to introduce a regularization to the nuclear norm, which is shown to be able to exactly recover both the number of communities and cluster memberships even when the number of communities is unknown. In many real-world networks, it is more common to see both network structure and node covariates simultaneously. In this case, we present a regularization-based method to effectively combine the two sources of information. The proposed method works especially well when the covariates and network contain complementary information.
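The Stochastic Block Model described above can be illustrated by sampling a graph from it. The sketch below assumes an undirected graph without self-loops; the function name `sample_sbm`, the block probability matrix `B`, and the two-community example are hypothetical choices for illustration and are not taken from the thesis.

```python
import numpy as np

def sample_sbm(labels, B, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic block model."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n = len(labels)
    # Edge probability depends only on the two nodes' cluster assignments:
    # P[i, j] = B[labels[i], labels[j]].
    P = B[labels][:, labels]
    # Draw each pair once (strict upper triangle), then mirror for symmetry.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(int)

# Two communities of 50 nodes: dense within, sparse between (assumed values).
labels = np.repeat([0, 1], 50)
B = np.array([[0.60, 0.05],
              [0.05, 0.60]])
A = sample_sbm(labels, B)

within = A[:50, :50].mean()    # empirical edge density inside community 0
between = A[:50, 50:].mean()   # empirical edge density across communities
```

Community detection is the inverse problem: given only `A`, recover `labels` (and, in the setting the abstract highlights, the number of communities as well).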