Factorial Hidden Markov Models for full and weakly supervised supertagging

dc.contributor.advisor: Mooney, Raymond J. (Raymond Joseph) [en]
dc.contributor.committeeMember: Baldridge, Jason [en]
dc.creator: Ramanujam, Srivatsan [en]
dc.date.accessioned: 2010-06-04T14:44:01Z [en]
dc.date.available: 2010-06-04T14:44:01Z [en]
dc.date.issued: 2009-08 [en]
dc.date.submitted: August 2009 [en]
dc.description: text [en]
dc.description.abstract: For many sequence prediction tasks in Natural Language Processing, modeling dependencies between individual predictions can be used to improve prediction accuracy of the sequence as a whole. Supertagging involves assigning lexical entries to words based on a lexicalized grammatical theory such as Combinatory Categorial Grammar (CCG). Previous work has used Bayesian HMMs to learn taggers for both POS tagging and supertagging separately. Modeling them jointly has the potential to produce more robust and accurate supertaggers trained with less supervision and thereby potentially help in the creation of useful models for new languages and domains. Factorial Hidden Markov Models (FHMMs) support joint inference for multiple sequence prediction tasks. Here, I use them to jointly predict part-of-speech tag and supertag sequences with varying levels of supervision. I show that supervised training of FHMMs improves performance compared to standard HMMs, especially when labeled training material is scarce. Secondly, FHMMs trained from tag dictionaries rather than labeled examples also perform better than a standard HMM. Finally, I show that an FHMM and a maximum entropy Markov model can complement each other in a single-step co-training setup that improves the performance of both models when there is limited labeled training material available. [en]
dc.description.department: Computer Science
dc.format.mimetype: application/pdf [en]
dc.identifier.uri: http://hdl.handle.net/2152/ETD-UT-2009-08-350 [en]
dc.language.iso: eng [en]
dc.subject: Hidden Markov Models [en]
dc.subject: Bayesian Models [en]
dc.subject: Categorial Grammar [en]
dc.subject: Supertagging [en]
dc.subject: Joint Inference [en]
dc.title: Factorial Hidden Markov Models for full and weakly supervised supertagging [en]
dc.type.genre: thesis [en]
thesis.degree.department: Computer Sciences [en]
thesis.degree.discipline: Computer Sciences [en]
thesis.degree.grantor: The University of Texas at Austin [en]
thesis.degree.level: Masters [en]
thesis.degree.name: Master of Arts [en]

Access full-text files

Original bundle

Name: RAMANUJAM-THESIS.pdf
Size: 347.78 KB
Format: Adobe Portable Document Format
