Semisupervised sentiment analysis of tweets based on noisy emoticon labels
There is high demand for computational tools that can automatically label tweets (Twitter messages) as having positive or negative sentiment, but building a hand-labeled training corpus large enough for standard machine learning techniques would require great effort and expense. Going beyond current keyword-based heuristic techniques, this paper uses emoticons (e.g. ':)' and ':(') as noisy labels to collect a large training set with little human intervention, and trains a Maximum Entropy classifier on that set. Results on two hand-labeled test corpora are compared against various baselines and a keyword-based heuristic approach; the machine-learned classifier significantly outperforms both.
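
The general idea of emoticon-based noisy labeling can be illustrated with a minimal sketch. It is not the paper's implementation: the emoticon lists, the unigram features, the stochastic gradient training loop, its hyperparameters, and the sample tweets are all assumptions made for illustration. For two classes, a Maximum Entropy classifier is equivalent to logistic regression, which is what the sketch trains.

```python
import math
import re
from collections import defaultdict

# Hypothetical emoticon lists; the abstract only names ':)' and ':(' as examples.
POSITIVE = (":)", ":-)", ":D")
NEGATIVE = (":(", ":-(")
BIAS = "<bias>"  # cannot collide with word features, which contain only [a-z']


def noisy_label(tweet):
    """Return 1 (positive), 0 (negative), or None for no or mixed emoticons."""
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:
        return None
    return 1 if has_pos else 0


def features(tweet):
    """Lowercased unigrams, with emoticons removed so the label source is not a feature."""
    for e in POSITIVE + NEGATIVE:
        tweet = tweet.replace(e, " ")
    return set(re.findall(r"[a-z']+", tweet.lower()))


def predict_proba(weights, feats):
    """Probability of positive sentiment under a binary MaxEnt (logistic) model."""
    z = weights.get(BIAS, 0.0) + sum(weights.get(f, 0.0) for f in feats)
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(tweets, epochs=50, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    data = [(features(t), noisy_label(t)) for t in tweets]
    data = [(f, y) for f, y in data if y is not None]  # drop unlabeled tweets
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, y in data:
            error = y - predict_proba(weights, feats)
            weights[BIAS] += lr * error
            for f in feats:
                weights[f] += lr * error
    return weights


# Made-up tweets for illustration only.
sample_tweets = [
    "I love this phone :)",
    "what a great day :D",
    "love the new album :-)",
    "I hate waiting :(",
    "this is terrible :-(",
    "worst day ever, hate it :(",
    "no emoticon here",       # skipped: no label
    "mixed feelings :) :(",   # skipped: conflicting labels
]
weights = train(sample_tweets)
print(predict_proba(weights, features("great love")))
print(predict_proba(weights, features("hate terrible")))
```

Removing the emoticons from the features is a design choice in this sketch: without it, the classifier could learn to detect only the emoticons themselves, which would not help on test tweets that lack them.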