Bayesian Probability in Basic Sentiment Analysis

One of the most interesting and surprising applications of Bayesian probability is sentiment analysis. It may be surprising that Bayes' theorem is so useful in this field (and, more broadly, in machine learning in general), but the setup is the same as the one we discussed in class. Suppose we have two urns, one labeled positive and one labeled negative. We sort reviews on some website by their score (e.g., on a 10-point scale, reviews scoring above 6 are positive, reviews scoring below 4 are negative, and those in between are discarded as ambiguous). Then we take every word in each review and put it in the urn matching that review's label. This is the dataset: two urns filled with words, where a word is added again every time it appears. From it we can estimate how likely a word is to come from each urn, which indicates the word's sentiment. To classify an arbitrary review, we use Bayes' theorem to set up the probability of a positive review given the words in the review, e.g. Pr(Positive|“This restaurant rocks!”), and the probability of a negative review given the same words, and see which one is higher. Surprisingly, word order matters very little; with a large enough dataset, simply knowing which words are present is usually enough to indicate the sentiment of a review.
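The urn setup above can be sketched in a few lines of Python. This is only an illustration under some assumptions that are not in the discussion above: the toy reviews are made up, the function names (`tokenize`, `fill_urns`, `log_posteriors`, `classify`) are hypothetical, and add-one (Laplace) smoothing is added so that a word that never landed in an urn does not force the probability to zero. Logarithms are used so that multiplying many small probabilities does not underflow.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def fill_urns(reviews, high=6, low=4):
    """Sort the words of (text, score) reviews into positive/negative urns."""
    urns = {"positive": Counter(), "negative": Counter()}
    review_counts = Counter()
    for text, score in reviews:
        if score > high:
            label = "positive"
        elif score < low:
            label = "negative"
        else:
            continue  # middle scores are discarded as ambiguous
        urns[label].update(tokenize(text))
        review_counts[label] += 1
    return urns, review_counts


def log_posteriors(text, urns, review_counts):
    """Return log Pr(label) + sum of log Pr(word | label) for each label.

    Word order is ignored. Assumes each urn received at least one review.
    """
    vocabulary = set(urns["positive"]) | set(urns["negative"])
    total_reviews = sum(review_counts.values())
    scores = {}
    for label, urn in urns.items():
        urn_size = sum(urn.values())
        score = math.log(review_counts[label] / total_reviews)
        for word in tokenize(text):
            # Add-one smoothing (an assumption of this sketch).
            score += math.log((urn[word] + 1) / (urn_size + len(vocabulary)))
        scores[label] = score
    return scores


def classify(text, urns, review_counts):
    """Pick the label whose (unnormalized) posterior is higher."""
    scores = log_posteriors(text, urns, review_counts)
    return max(scores, key=scores.get)


# Made-up toy data: (review text, score out of 10).
toy_reviews = [
    ("This place rocks, great food", 9),
    ("Great service and friendly staff", 8),
    ("Terrible food and rude staff", 2),
    ("Awful service, never again", 1),
    ("It was okay", 5),
]
urns, review_counts = fill_urns(toy_reviews)
print(classify("This restaurant rocks!", urns, review_counts))
```

The two scores share the same denominator in Bayes' theorem (the probability of the words themselves), so it can be dropped when all we want is to see which label is higher.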

 

The first linked article applies this approach to tweets. The researchers first classify tweets as positive or negative, then continue training their algorithm on new tweets: the words of each tweet are sorted into urns according to the tweet's sentiment, and the content of a new tweet is compared against those urns. With this, they can take a live data stream of tweets, analyze them immediately, and then add the new tweets to the dataset (where they are checked over by the researchers).
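As a rough illustration of that loop (classify an incoming tweet, then add it to the urns), here is a minimal Python sketch. It is not the article's implementation: the class `StreamingUrns`, its method names, the toy tweets, and the add-one smoothing are all assumptions made for this example, and the researchers' manual check is replaced by simply accepting the classifier's guess.

```python
import math
from collections import Counter


class StreamingUrns:
    """Two word urns that keep growing as newly labeled tweets arrive."""

    def __init__(self):
        self.urns = {"positive": Counter(), "negative": Counter()}
        self.tweet_counts = Counter()

    def add(self, text, label):
        """Drop the words of a labeled tweet into the matching urn."""
        self.urns[label].update(text.lower().split())
        self.tweet_counts[label] += 1

    def classify(self, text):
        """Return the label with the higher smoothed naive Bayes score."""
        vocabulary = set(self.urns["positive"]) | set(self.urns["negative"])
        total = sum(self.tweet_counts.values())
        best_label, best_score = None, -math.inf
        for label, urn in self.urns.items():
            urn_size = sum(urn.values())
            # Smoothed prior, so an empty urn does not cause log(0).
            score = math.log((self.tweet_counts[label] + 1) / (total + 2))
            for word in text.lower().split():
                # Add-one smoothing; the extra 1 in the denominator
                # reserves a slot for words never seen before.
                score += math.log(
                    (urn[word] + 1) / (urn_size + len(vocabulary) + 1)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = StreamingUrns()
# Made-up seed tweets that were labeled by hand.
model.add("love this phone", "positive")
model.add("hate this phone", "negative")

# Made-up "live" tweets, processed one at a time.
incoming = ["love love love it", "i hate waiting"]
for tweet in incoming:
    guess = model.classify(tweet)
    # Simplifying assumption: accept the guess without a human check.
    confirmed = guess
    model.add(tweet, confirmed)
    print(tweet, "->", confirmed)
```

Because the model is just a pair of word counts, updating it with a new tweet is cheap, which is what makes this kind of approach workable on a live stream.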

https://ieeexplore.ieee.org/abstract/document/7877424?casa_token=eC2rnuoG0_AAAAAA:LE9kEBnxcRBBYcLr4dqjvyjRVGZXEDgXqiKZzfxO2ioqpyDTut1JeEM9S8_OgTpbFTBJjZkoNA
https://towardsdatascience.com/sentiment-analysis-introduction-to-naive-bayes-algorithm-96831d77ac91
