Browsing by Subject "Label propagation"

Using social network information in recommender systems
(2011-08) Sudan, Nikita Maple; Ghosh, Joydeep; Baldridge, Jason
Recommender systems are used to select online information relevant to a given user. Traditional (memory-based) recommenders explore the user-item rating matrix and make recommendations based on users who have rated similarly or items that have been rated similarly. With the growing popularity of social networks, recommender systems can benefit from combining the history of user preferences with information from the social/trust network of users. This thesis explores two techniques for combining user-item rating history with trust network information to make better user-item rating predictions. The first approach (SCOAL [5]) simultaneously co-clusters and learns separate models for each co-cluster. The co-clustering is based on the user features as well as the rating history. This captures the intuition that certain groups of users have similar preferences for certain groups of items. The grouping of users is affected by similarity in both rating behavior and the trust network. The second approach, graph-based label propagation (MAD [27]), works in a transductive setting and propagates ratings of user-item pairs directly on the user social graph. We evaluate both approaches on two large public datasets from Epinions.com and Flixster.com. The thesis is among the first to explore the role of distrust in rating prediction. Since distrust is not as transitive as trust (an enemy's enemy need not be either an enemy or a friend), distrust cannot directly replace trust in trust propagation approaches. By using a low-dimensional representation of the original trust network in SCOAL, we use distrust as is and do not propagate it. Using SCOAL, we can pinpoint the groups of users and the groups of items that have the same preference model. Both SCOAL and MAD are able to seamlessly integrate side information, such as item-subject and item-author information, into the trust-based rating prediction model.
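The idea of propagating ratings over a user social graph can be illustrated with a minimal sketch. This is not MAD as used in the thesis; it is a much simpler neighbor-averaging scheme with clamped seed ratings, and the function name, the toy graph, and the ratings are all hypothetical.

```python
def propagate_ratings(neighbors, seed_ratings, iterations=50):
    """Simplified label propagation (an illustrative stand-in, not MAD).

    neighbors: dict mapping each user to a list of connected users.
    seed_ratings: dict of known ratings for one item; these stay fixed.
    Returns a dict with the seed ratings plus estimates for other users.
    """
    scores = dict(seed_ratings)
    for _ in range(iterations):
        updated = dict(seed_ratings)  # seed users are clamped every round
        for user, friends in neighbors.items():
            if user in seed_ratings:
                continue
            # Average the current scores of neighbors that have one.
            known = [scores[f] for f in friends if f in scores]
            if known:
                updated[user] = sum(known) / len(known)
        scores = updated
    return scores


# Hypothetical chain-shaped trust graph: ann - bob - cat - dan
neighbors = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}
seed_ratings = {"ann": 5.0, "dan": 1.0}  # assumed known ratings for one item
predicted = propagate_ratings(neighbors, seed_ratings)
```

In this toy graph the unrated users end up with estimates between the two seed ratings, each pulled toward the seed that is closer in the graph.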
Weakly supervised part-of-speech tagging for Chinese using label propagation
(2011-05) Ding, Weiwei, 1985-; Baldridge, Jason; Erk, Katrin
Part-of-speech (POS) tagging is one of the most fundamental and crucial tasks in Natural Language Processing. Chinese POS tagging is challenging because it also involves word segmentation. This report focuses on improving unsupervised POS tagging using Hidden Markov Models and the Expectation Maximization parameter estimation approach (EM-HMM). The traditional EM-HMM system uses a tag dictionary to constrain possible tag sequences and to initialize the model parameters. This is a very crude initialization: the emission parameters are set uniformly in accordance with the tag dictionary. To improve this, word alignments can be used. Word alignments are word-level translation correspondences generated from parallel text in two languages. In this report, Chinese-English word alignment is used. Performance is expected to improve because the two sources of information are complementary: the dictionary provides information on word types, while word alignment provides information on word tokens. However, word alignment is found to be of limited benefit. This report therefore proposes another method. To improve dictionary coverage and obtain better POS distributions, Modified Adsorption, a label propagation algorithm, is used. We construct a graph connecting word tokens to feature types (such as word unigrams and bigrams) and connecting those tokens to information from knowledge sources, such as a small tag dictionary, Wiktionary, and word alignments. The core idea is to use a small amount of supervision, in the form of a tag dictionary, to acquire POS distributions for each word (both known and unknown) and to provide these as an improved initialization for EM learning of the HMM. We find this strategy to work very well, especially when we have a small tag dictionary. Label propagation provides a better initialization for the EM-HMM method because it greatly increases the coverage of the dictionary.
In addition, label propagation is flexible enough to incorporate many kinds of knowledge. However, results also show that some resources, such as the word alignments, are not easily exploited with label propagation.
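The baseline initialization described above, where emission parameters are set uniformly in accordance with the tag dictionary, might look roughly like the following sketch. It assumes one common reading, namely that P(word | tag) is uniform over the words the dictionary lists for that tag; the report may define it differently. The function name and the toy English dictionary entries are hypothetical.

```python
def uniform_emissions(tag_dictionary):
    """Build P(word | tag), uniform over the words allowed for each tag.

    tag_dictionary: dict mapping each word type to its list of allowed tags.
    Assumption: uniformity is over words per tag, not over tags per word.
    """
    words_for_tag = {}
    for word, tags in tag_dictionary.items():
        for tag in tags:
            words_for_tag.setdefault(tag, []).append(word)
    return {
        tag: {word: 1.0 / len(words) for word in words}
        for tag, words in words_for_tag.items()
    }


# Hypothetical toy tag dictionary (English words used only for readability)
tag_dictionary = {
    "the": ["DT"],
    "book": ["NN", "VB"],
    "read": ["VB"],
}
emissions = uniform_emissions(tag_dictionary)
```

Words missing from the dictionary receive no informed probability mass under this scheme, which is the coverage gap the label propagation step is meant to reduce.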