Sentiment analysis on Twitter with stock price and significant keyword correlation

dc.contributor.advisor: Dhillon, Inderjit
dc.creator: Zhang, Linhao
dc.date.accessioned: 2013-04-30T19:40:05Z
dc.date.available: 2013-04-30T19:40:05Z
dc.date.issued: 2013
dc.description.abstract: Though uninteresting individually, Twitter messages, or tweets, can provide an accurate reflection of public sentiment when taken in aggregate. In this paper, we primarily examine the effectiveness of various machine learning techniques at assigning a positive or negative sentiment to a tweet corpus. Additionally, we apply the extracted Twitter sentiment to two tasks. First, we look for a correlation between Twitter sentiment and stock prices. Second, we determine which words in tweets correlate with changes in stock prices through a post hoc analysis of price changes and tweets. We accomplish this by mining tweets using Twitter's search API and subsequently processing them for analysis. For the task of determining sentiment, we test the effectiveness of three machine learning techniques: Naive Bayes classification, Maximum Entropy classification, and Support Vector Machines (SVMs). We find that SVMs give the most consistently high accuracy under cross-validation, though only by a small margin. Additionally, we discuss various approaches to training these classifiers. We then apply our findings on an intra-day market scale and find very little direct correlation between stock prices and tweet sentiment at that scale. Next, we improve on the keyword search approach by reverse-correlating stock prices with individual words in tweets, finding, reasonably, that certain keywords are more strongly correlated with changes in stock prices than others. Lastly, we discuss various challenges posed by using Twitter to perform stock predictions.
dc.description.department: Computer Sciences
dc.identifier.uri: http://hdl.handle.net/2152/20057
dc.language.iso: eng
dc.subject: Twitter
dc.subject: sentiment analysis
dc.subject: computer science
dc.subject: natural language processing
dc.subject: stocks
dc.title: Sentiment analysis on Twitter with stock price and significant keyword correlation
dc.type: Thesis

Access full-text files

Original bundle

Name: Zhang_Linhao_Thesis.pdf
Size: 605.36 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.66 KB
Format: Item-specific license agreed upon to submission
Name: Linhao_Zhang_consent.pdf
Size: 607.22 KB
Format: Adobe Portable Document Format
Description: Consent to UTDR License
