Browsing by Subject "Hidden Markov model"

Bayesian approaches for modeling protein biophysics
(2014-08) Hines, Keegan; Aldrich, R. W. (Richard W.)
Proteins are the fundamental unit of computation and signal processing in biological systems. A quantitative understanding of protein biophysics is of paramount importance, since even slight malfunction of proteins can lead to diverse and severe disease states. However, developing accurate and useful mechanistic models of protein function can be strikingly elusive. I demonstrate that the adoption of Bayesian statistical methods can greatly aid in modeling protein systems. I first discuss the pitfall of parameter non-identifiability and how a Bayesian approach to modeling can yield reliable and meaningful models of molecular systems. I then delve into a particular case of non-identifiability within the context of an emerging experimental technique called single molecule photobleaching. I show that the interpretation of this data is non-trivial and provide a rigorous inference model for the analysis of this pervasive experimental tool. Finally, I introduce the use of nonparametric Bayesian inference for the analysis of single molecule time series. These methods aim to circumvent problems of model selection and parameter identifiability and are demonstrated with diverse applications in single molecule biophysics. The adoption of sophisticated inference methods will lead to a more detailed understanding of biophysical systems.
Hidden Markov model and financial application
(2016-08) Li, Na; Lin, Lizhen; Margaret, Myers
A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. This report applies HMMs to financial time series data to explore the underlying regimes that the model can predict. These underlying regimes can serve as an important signal of market environments and as guidance for investors adjusting their portfolios to maximize performance. This report is composed of three chapters. The first chapter introduces the difficulties in predicting financial time series, the limitations of traditional time series models, the justification for choosing HMMs, and previous studies. The second chapter gives a detailed overview of the HMM, including the basic mathematical framework and the fundamental questions and algorithms addressed by the model. In the third chapter, a trend analysis of the stock market is carried out using a hidden Markov model. For a given observation sequence, the hidden sequence of states and their corresponding probability values are found. This analysis builds a platform for investors and decision makers to make decisions on the basis of the probability and transition pattern of each hidden state, which cannot be observed from market data.
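As an illustration of the decoding step this abstract describes (finding the hidden state sequence for a given observation sequence), the following is a minimal sketch of the standard Viterbi algorithm for a discrete-emission HMM. It is not taken from the report: the two-regime setup, the parameter values, and the names `viterbi`, `start_p`, `trans_p`, and `emit_p` are assumptions chosen only for the example.

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete-emission HMM (log domain)."""
    n_states = trans_p.shape[0]
    T = len(obs)
    with np.errstate(divide="ignore"):  # allow log(0) = -inf for impossible events
        log_start = np.log(start_p)
        log_trans = np.log(trans_p)
        log_emit = np.log(emit_p)

    log_delta = np.full((T, n_states), -np.inf)  # best log-probability ending in each state
    backptr = np.zeros((T, n_states), dtype=int)  # best predecessor of each state

    log_delta[0] = log_start + log_emit[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to state j
        scores = log_delta[t - 1][:, None] + log_trans
        backptr[t] = scores.argmax(axis=0)
        log_delta[t] = scores.max(axis=0) + log_emit[:, obs[t]]

    # Trace the best path backwards from the most probable final state.
    path = np.zeros(T, dtype=int)
    path[-1] = log_delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = backptr[t + 1, path[t + 1]]
    return path, log_delta[-1].max()

# Hypothetical two-regime example: state 0 = "bull", state 1 = "bear";
# observation 0 = up day, observation 1 = down day.
start_p = np.array([0.5, 0.5])
trans_p = np.array([[0.9, 0.1],
                    [0.1, 0.9]])
emit_p = np.array([[0.8, 0.2],
                   [0.3, 0.7]])
obs = [0, 0, 0, 1, 1, 1]

path, log_prob = viterbi(obs, start_p, trans_p, emit_p)
print(path.tolist(), float(np.exp(log_prob)))
```

With these illustrative parameters the decoded path switches from regime 0 to regime 1 after the third observation. The "corresponding probability values" mentioned in the abstract may instead refer to posterior state probabilities, which would come from the forward-backward algorithm rather than from Viterbi decoding.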
Segmentation of highway networks for maintenance operations
(2016-05) Kim, Moo Yeon; Williamson, Sinead; Prozzi, Jorge A.
Pavement maintenance and rehabilitation (M&R) is important for transportation agencies seeking a sustainable transportation infrastructure. In maintenance operations, identifying the limits of homogeneous sections is a key problem because appropriate segmentation can help yield a more cost-effective M&R plan. The purpose of this study is to present the results of an investigation of prior research and to suggest a direction for developing an enhanced methodological framework. Existing approaches for pavement segmentation were explored through a literature review and data analysis. Autocorrelation tests, change-point approaches, a Bayesian method, and a hidden Markov model were applied to pavement condition data. Directions for future work are suggested for developing a segmentation method capable of handling the issues found in the study.
Topics in computational statistics with applications in finance
(2023-12) Rotiroti, Frank; Walker, Stephen G., 1945-; Carvalho, Carlos Marinho, 1978-; Zitkovic, Gordan; Murray, Jared S
The dissertation comprises two parts, each commenced with an introduction to the principal ideas and methods therein: The first part concerns topics related to marginal and intractable likelihood estimation, focusing on the estimation of a density function at a particular point. In particular, we present a Monte Carlo estimator based on the Fourier integral theorem. The second part concerns Bayesian approaches to state-space models with applications in finance. First, we introduce a Bayesian vector autoregression to examine strategic asset allocation for long-run investors, given estimation risk and the choice of multiple risky assets. Then, we devise a variation on the Bayesian additive regression trees (BART) framework to incorporate time-dependent data, as well as stochastic volatility (SV), before applying this approach to the problem of predicting a firm’s stock return with observable firm characteristics. Joining the two parts is an interlude, which describes an approach to the particle filtering of hidden Markov models which reverses the standard sampling-resampling perspective and, along with several simulation studies, includes an example involving an SV model.
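For context on the interlude, the following is a minimal sketch of the standard sampling-resampling (bootstrap) particle filter applied to a basic stochastic volatility model. It is explicitly not the reversed scheme the dissertation proposes, whose details are not given in the abstract. The model form (log-volatility `x_t = mu + phi * (x_{t-1} - mu) + sigma * eta_t`, return `y_t = exp(x_t / 2) * eps_t`), the parameter values, and the function name `bootstrap_filter` are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_filter(y, mu, phi, sigma, n_particles=1000, seed=0):
    """Standard bootstrap particle filter for a basic SV model (illustrative)."""
    rng = np.random.default_rng(seed)
    # Initialise particles from the stationary distribution of the AR(1) log-volatility.
    x = mu + sigma / np.sqrt(1.0 - phi ** 2) * rng.standard_normal(n_particles)
    filtered_means = np.empty(len(y))
    loglik = 0.0
    for t, y_t in enumerate(y):
        if t > 0:
            # Sampling step: propagate particles through the state transition.
            x = mu + phi * (x - mu) + sigma * rng.standard_normal(n_particles)
        # Weighting step: log density of y_t under N(0, exp(x)).
        logw = -0.5 * (np.log(2.0 * np.pi) + x + y_t ** 2 * np.exp(-x))
        shift = logw.max()  # subtract the maximum for numerical stability
        w = np.exp(logw - shift)
        loglik += shift + np.log(w.mean())  # running log-likelihood estimate
        w /= w.sum()
        filtered_means[t] = np.sum(w * x)
        # Resampling step: multinomial resampling proportional to the weights.
        x = x[rng.choice(n_particles, size=n_particles, p=w)]
    return filtered_means, loglik

# Simulate a short series from the same hypothetical model and filter it.
rng = np.random.default_rng(1)
mu, phi, sigma, T = -1.0, 0.95, 0.2, 200
x_true = np.empty(T)
x_true[0] = mu + sigma / np.sqrt(1.0 - phi ** 2) * rng.standard_normal()
for t in range(1, T):
    x_true[t] = mu + phi * (x_true[t - 1] - mu) + sigma * rng.standard_normal()
y = np.exp(x_true / 2.0) * rng.standard_normal(T)

filtered_means, loglik = bootstrap_filter(y, mu, phi, sigma)
```

Multinomial resampling at every step is the simplest choice. Practical implementations often use systematic resampling or resample only when the effective sample size drops below a threshold.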
Uncertainty quantification and its properties for hidden Markov models with application to condition based maintenance
(2017-12) Zhang, Deyi, Ph. D.; Djurdjanovic, Dragan; Hasenbein, John; Bickel, James Eric; Walker, Stephen G; Hanasusanto, Grani
Condition-based maintenance (CBM) can be viewed as a transformation of data gathered from a piece of equipment into information about its condition, and further into decisions on what to do with the equipment. The hidden Markov model (HMM) is a useful framework for probabilistically modeling the condition of complex engineering systems with partial observability of the underlying states. Condition monitoring and prediction for such a system require accurate knowledge of the HMM that describes its degradation, estimated from data collected by the sensors mounted on it, as well as an understanding of the uncertainty of the HMMs identified from the available data. To that end, this thesis proposes a novel HMM estimation scheme based on Bayes' theorem. The newly proposed Bayesian approach for estimating HMM parameters naturally yields information about parametric uncertainties via the posterior distributions of the HMM parameters that emerge from the estimation process. In addition, a novel condition monitoring scheme based on uncertain HMMs of the degradation process is proposed and demonstrated on a large dataset obtained from a semiconductor manufacturing facility. A portion of the data was used to build operating-mode-specific HMMs of machine degradation via the newly proposed Bayesian estimation process, while the remainder was used to monitor machine condition using the uncertain degradation HMMs yielded by Bayesian estimation. Comparison with a traditional signature-based statistical monitoring method showed that the newly proposed approach effectively utilizes the fact that its parameters are themselves uncertain, leading to orders of magnitude fewer false alarms. This methodology is further extended to address the practical issue that maintenance interventions are usually imperfect.
We propose both a novel non-ergodic and non-homogeneous HMM that assumes imperfect maintenance and a novel process monitoring method capable of monitoring the hidden states while accounting for model uncertainty. Significant improvements in both the log-likelihood of the estimated HMM parameters and the monitoring performance were observed, compared to those obtained using degradation HMMs that always assumed perfect maintenance. Finally, the behavior of the posterior distribution of the parameters of the unidirectional non-ergodic HMMs used in this thesis to model degradation was analyzed theoretically in terms of its evolution as more data become available in the estimation process. The convergence problem is formulated as a Bernstein-von Mises theorem (BvMT), and under certain regularity conditions, the sequence of posterior distributions is proven to converge to a Gaussian distribution with variance matrix being the inverse of the Fisher information matrix. An example of a unidirectional HMM is presented for which the regularity conditions are verified, and illustrations of the expected theoretical results are given using simulation. Understanding this convergence of posterior distributions enables one to determine when Bayesian estimation of degradation HMMs is justified and converges toward the true model parameters, as well as how much data one then needs to achieve the desired accuracy of the resulting model. Understanding these issues is of utmost importance if HMMs are to be used for degradation modeling and monitoring.
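For readers unfamiliar with the result, a generic textbook form of a Bernstein-von Mises statement is sketched below. The notation is an assumption introduced here, not the thesis's: $\theta_0$ is the true parameter, $\hat{\theta}_n$ an efficient estimator based on $n$ observations $y_{1:n}$, $I(\theta_0)$ the Fisher information matrix, and $\Pi(\cdot \mid y_{1:n})$ the posterior. The exact regularity conditions and mode of convergence proven in the thesis are not stated in the abstract.

```latex
% Generic Bernstein-von Mises statement (illustrative notation, not the thesis's):
% the posterior approaches a Gaussian centred at an efficient estimator,
% with covariance given by the inverse Fisher information scaled by 1/n.
\[
  \left\|
    \Pi\!\left(\theta \in \cdot \,\middle|\, y_{1:n}\right)
    - \mathcal{N}\!\left(\hat{\theta}_n,\ \tfrac{1}{n}\, I(\theta_0)^{-1}\right)
  \right\|_{\mathrm{TV}}
  \xrightarrow{\;P_{\theta_0}\;} 0
  \qquad \text{as } n \to \infty .
\]

% Heuristic consequence for data requirements: the posterior standard deviation
% of the j-th parameter is roughly
\[
  \operatorname{sd}\!\left(\theta_j \mid y_{1:n}\right)
  \approx \sqrt{\frac{\left[ I(\theta_0)^{-1} \right]_{jj}}{n}},
  \qquad \text{so a target } \varepsilon \text{ suggests }
  n \gtrsim \frac{\left[ I(\theta_0)^{-1} \right]_{jj}}{\varepsilon^{2}} .
\]
```

The second display is only a rule of thumb that follows from the Gaussian approximation. It indicates why such a convergence result can be used to estimate how much data a desired accuracy requires.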
Weakly supervised part-of-speech tagging for Chinese using label propagation
(2011-05) Ding, Weiwei, 1985-; Baldridge, Jason; Erk, Katrin
Part-of-speech (POS) tagging is one of the most fundamental and crucial tasks in Natural Language Processing. Chinese POS tagging is challenging because it also involves word segmentation. This report focuses on how to improve unsupervised POS tagging using Hidden Markov Models and the Expectation Maximization parameter estimation approach (EM-HMM). The traditional EM-HMM system uses a dictionary to constrain the possible tag sequences and to initialize the model parameters. This is a very crude initialization: the emission parameters are set uniformly in accordance with the tag dictionary. To improve this, word alignments can be used. Word alignments are word-level translation correspondences generated from parallel text in two languages. In this report, Chinese-English word alignment is used. The performance is expected to be better, as these two tasks are complementary to each other. The dictionary provides information on word types, while word alignment provides information on word tokens. However, word alignment is found to be of limited benefit. In this report, another method is proposed. To improve the dictionary coverage and obtain better POS distributions, Modified Adsorption, a label propagation algorithm, is used. We construct a graph connecting word tokens to feature types (such as word unigrams and bigrams) and connecting those tokens to information from knowledge sources, such as a small tag dictionary, Wiktionary, and word alignments. The core idea is to use a small amount of supervision, in the form of a tag dictionary, to acquire POS distributions for each word (both known and unknown) and to provide these as an improved initialization for EM learning of the HMM. We find this strategy to work very well, especially when we have a small tag dictionary. Label propagation provides a better initialization for the EM-HMM method because it greatly increases the coverage of the dictionary.
In addition, label propagation is flexible enough to incorporate many kinds of knowledge. However, results also show that some resources, such as the word alignments, are not easily exploited with label propagation.
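The baseline initialization described in this abstract (emission parameters set uniformly in accordance with the tag dictionary) can be sketched as follows. This is one possible reading, not the report's code: it assumes that words missing from the dictionary may take any tag, and that each tag's emission distribution is uniform over the words allowed for it. The function name `init_emissions` and the toy vocabulary are hypothetical.

```python
import numpy as np

def init_emissions(vocab, tags, tag_dict):
    """Uniform P(word | tag) over the words that the tag dictionary allows for each tag.

    Words absent from tag_dict are treated as compatible with every tag
    (an assumption made for this sketch).
    """
    tag_index = {tag: i for i, tag in enumerate(tags)}
    allowed = np.zeros((len(tags), len(vocab)))
    for j, word in enumerate(vocab):
        for tag in tag_dict.get(word, tags):
            allowed[tag_index[tag], j] = 1.0
    row_sums = allowed.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every tag needs at least one compatible word")
    return allowed / row_sums  # each row is a distribution over the vocabulary

# Hypothetical toy example: three dictionary words and one unknown word.
vocab = ["de", "shu", "du", "xinci"]
tags = ["N", "V", "U"]
tag_dict = {"de": ["U"], "shu": ["N"], "du": ["V"]}

emissions = init_emissions(vocab, tags, tag_dict)
print(emissions)
```

In this toy case each tag splits its probability mass evenly between its one dictionary word and the unknown word. An approach such as label propagation would replace the 0/1 `allowed` matrix with graded per-word tag distributions before EM training starts.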