Automated sentiment analysis tools promise something very valuable, but do they live up to that promise?
I’ve been seeing a lot of talk recently about sentiment analysis as a tool for tracking online reputation. I’ve been a bit skeptical, but there is a certain appeal to the idea of being able to distinguish between positive and negative messages so they can be handled appropriately, as well as getting an overall idea of the brand’s reputation. So I’ve been doing some testing, and I’ve found that my skepticism was well-founded. Simply put, these tools just aren’t very accurate at determining negative sentiment, for a variety of reasons. The main one is that they still basically look at the words used without fully understanding the context.
For some examples, I used Sentiment140.com to do a Twitter sentiment analysis on Herbalife. Herbalife is not a client. I chose them because they’ve been in the news a lot lately. The report analyzed 88 tweets and found 60 positive, 15 negative and 13 neutral, for a total rating of 80% positive vs. 20% negative (the rating leaves out the neutral tweets).
But the reality is quite different. Going through it, I found:
- 12 positives misclassified as neutral
- 11 positives misclassified as negative
- 1 negative misclassified as neutral
- 1 neutral misclassified as negative
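For anyone who wants to check the arithmetic, here is a minimal Python sketch that applies the four corrections above to the reported counts. The variable and function names are my own, and it assumes the list above covers every misclassified tweet. It prints the positive share both ways, because the tool's 80% figure leaves neutral tweets out of the denominator.

```python
from collections import Counter

# Counts the tool reported for the 88 tweets.
reported = Counter(positive=60, negative=15, neutral=13)

# Hand-checked corrections: (actual label, label the tool gave, tweet count).
corrections = [
    ("positive", "neutral", 12),
    ("positive", "negative", 11),
    ("negative", "neutral", 1),
    ("neutral", "negative", 1),
]

def apply_corrections(reported_counts, fixes):
    """Move each misclassified tweet from the tool's label to the actual label."""
    corrected_counts = Counter(reported_counts)
    for actual, assigned, count in fixes:
        corrected_counts[assigned] -= count
        corrected_counts[actual] += count
    return corrected_counts

def positive_share(counts, include_neutral):
    """Share of positive tweets, with or without neutrals in the denominator."""
    denominator = counts["positive"] + counts["negative"]
    if include_neutral:
        denominator += counts["neutral"]
    return counts["positive"] / denominator

corrected = apply_corrections(reported, corrections)
misclassified = sum(count for _, _, count in corrections)
total = sum(reported.values())

print(dict(corrected))
print(f"tool agreed with hand labels on {total - misclassified} of {total} tweets")
for name, counts in (("reported", reported), ("corrected", corrected)):
    print(name,
          f"{positive_share(counts, include_neutral=False):.0%} excluding neutrals,",
          f"{positive_share(counts, include_neutral=True):.0%} of all tweets")
```

The agreement line is the number I find most telling: the tool matched my hand labels on 63 of the 88 tweets, or about 72%.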
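So by my count, 83 of the 88 tweets were actually positive, with 4 negative and 1 neutral. That’s 94% positive across all the tweets, or 95% positive vs. 5% negative if you leave out the neutrals the way the reported 80% figure does. Let’s look at some examples: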
I really wanna start herbalife ..
— Heidy (@heidymoreno_) July 8, 2013
Reported as neutral, but clearly positive — the tool just doesn’t recognize that in this context, “start” is a positive term.
Thnks to Herbalife Shake and AloeVera drink to relive my mind&body stress and also my gastric and vommiting pain..:)
— Zuhairah (@Zuhaairaah) July 8, 2013
Stress, gastric pain and vomiting. Negative? No, because Herbalife relieved them. (Maybe the “relive” typo affected it, but I doubt it.)
I don't know if I can take my herbalife shake into the doctors office. This is a problem #dontwannagetintrouble
— Leah Dawson (@LeahShelll) July 8, 2013
Reported as negative, but clearly not. This one simply requires human interpretation.
@Herbalife no refined carbs. Not even an ounce!
— Leon Kioko (@leonkaindikioko) July 8, 2013
“No” and “not even” signal negative, but in this context, they’re a good thing.
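To show how pure word-matching produces this kind of mistake, here is a toy scorer in Python. It is only a sketch: it is not Sentiment140's actual method, which I can't see, and the word lists and the `naive_sentiment` function are invented for illustration. It counts positive and negative words and ignores everything else about the sentence.

```python
import re

# Hypothetical word lists, made up for this illustration only.
POSITIVE_WORDS = {"thanks", "love", "great", "good"}
NEGATIVE_WORDS = {"no", "not", "problem", "pain", "stress"}

def naive_sentiment(text):
    """Label text by counting listed words, with no notion of context."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(word in POSITIVE_WORDS for word in words)
    score -= sum(word in NEGATIVE_WORDS for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

tweets = [
    "I really wanna start herbalife ..",
    "@Herbalife no refined carbs. Not even an ounce!",
]
for tweet in tweets:
    print(naive_sentiment(tweet), "<-", tweet)
```

With these made-up lists, the first tweet comes out neutral because none of its words are listed, and the second comes out negative because “no” and “not” each count against it. A real tool is surely more sophisticated than this, but the examples above suggest a similar blind spot.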
Maybe at some point in the future, these tools will be accurate enough to be useful, but at this point, the overall analysis is way too far off to be meaningful, and even using them to sort messages for handling is questionable. If you’re only trying to find the negative messages, they might be somewhat useful, because they do seem to err on the side of interpreting sentiment as negative, but why would you want to do that? You really should be responding to positive messages as well as negative ones anyway. And if a human being still has to make the interpretation, how is the sentiment analysis tool helping?
If you’ve had a different experience with sentiment analysis, or think you know of a tool that’s more accurate, please leave a comment below and let me know.