Sentiment Analysis / Opinion Mining
Natural Language Processing (NLP)
Identifying, extracting and quantifying emotional states, polarised opinions and subjective information
Modalities
Single (e.g. text-based)
Multi-modal (e.g. with audio and visual data)
Modelled as a classification problem
Subjectivity Classification
Objective
Fact
Subjective
Opinions / Experiences
Polarity Classification
Positive
Negative
Neutral
More fine-grained
Emotions
Joy
Surprise
Anger
Disgust
Fear
Sadness
Very Positive
Very Negative
Star ratings
Techniques to classify subjectivity & polarity
Rule-based
Approaches
Lexicon-based
A dictionary of positive & negative opinion words and expressions
Lexical Resources
SentiWords (~155,000 English words, each associated with a sentiment polarity score)
Python Resources
VADER (Valence Aware Dictionary and sEntiment Reasoner) - a lexicon and rule-based sentiment analysis tool
NLTK (Natural Language Toolkit)
Basic lexicon approach - look up the polarity of each word in the text and count; the polarity with the higher count wins
Classic NLP - parsing, tokenisation, part-of-speech tagging, stemming
Advantages
No training data required
Easy to debug
Disadvantages
Generally less accurate - fixed rules miss context, sarcasm and domain-specific usage
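The basic lexicon approach above can be sketched in a few lines of Python. This is a minimal illustration, not how VADER or any particular tool works: the LEXICON entries and the tokenise and classify names are invented for the example, and every word is scored +1 or -1 so that summing the scores is the same as letting the more frequent polarity win.

```python
import re

# Toy lexicon invented for illustration; real lexicons hold thousands of scored entries.
LEXICON = {
    "fantastic": 1, "good": 1, "great": 1, "love": 1,
    "bad": -1, "awful": -1, "poor": -1, "hate": -1,
}

def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def classify(text):
    """Sum the lexicon scores of all tokens and map the total to a polarity label."""
    total = sum(LEXICON.get(token, 0) for token in tokenise(text))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"  # no opinion words found, or positives and negatives cancel out

print(classify("The holiday was fantastic"))
print(classify("Awful service and poor food"))
print(classify("The hotel is near the station"))
```

Because the rules are a plain dictionary lookup, a wrong label can be traced to the exact words that produced it, which is the "easy to debug" advantage. The same simplicity is why sentences such as "not good" are misclassified unless extra rules are added.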
Automated (Machine Learning)
Model
Input is the text - preprocessed into word embeddings/vectors
Output is a prediction of the score/polarity
Algorithms
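Naive Bayes - probabilistic; uses Bayes' theorem to predict the most likely category
Linear Regression - predicts a continuous sentiment score (Logistic Regression is the usual counterpart for predicting categories)
Support Vector Machines (SVMs) - non-probabilistic; maps examples to points in a multidimensional space and separates the categories with a hyperplane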
Deep Learning
Advantages
Scalable
Typically more accurate than rule-based approaches
Disadvantages
Requires a large amount of labelled training data
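As a rough sketch of the machine learning route, the following is a small multinomial Naive Bayes classifier written from scratch. The training sentences are invented, the train and predict names are hypothetical, and the text is represented as simple word counts rather than word embeddings to keep the example self-contained. Add-one (Laplace) smoothing is assumed so that unseen word/class pairs do not zero out a class.

```python
import math
from collections import Counter, defaultdict

# Tiny invented training set; a real model needs far more labelled data.
TRAIN = [
    ("the holiday was fantastic", "positive"),
    ("great food and friendly staff", "positive"),
    ("i love this hotel", "positive"),
    ("the room was awful", "negative"),
    ("terrible service and cold food", "negative"),
    ("i hate the noise", "negative"),
]

def train(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(text, model):
    """Return the class with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(predict("the food was fantastic", model))
print(predict("awful noise", model))
```

With only six training sentences the model is easily fooled, which illustrates the disadvantage listed above: accuracy depends heavily on the amount and coverage of the training data.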
Vocabulary
Types of opinions
Direct opinions - specifically give a judgement/perspective in an up-front, straightforward manner
Comparative opinions - express similarities or differences between entities
Sentiment is expressed on whole entities/subjects, or on individual features/aspects
Ways opinions can be expressed
Explicitly (expressed in a subjective sentence)
Examples
"the holiday was fantastic"
Implicitly (expressed in an objective sentence, by implication or through metaphor)
Examples
"the oysters gave us food poisoning"
The aspect "size" can be implied from usage of the adjective "bulky"
The aspect "price" can be implied from usage of the adjective "expensive"
Applied at different levels
Document-level
Sentence-level
Phrase-level
Word-level
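To illustrate how the level of analysis changes the result, the sketch below scores the same review once per sentence and once as a whole document. The lexicon, the review text and the score and sentence_scores names are invented for the example, and the sentence splitter is a naive punctuation split rather than a proper tokeniser.

```python
import re

# Invented toy lexicon for illustration only.
LEXICON = {"fantastic": 1, "friendly": 1, "clean": 1, "awful": -1, "noisy": -1, "rude": -1}

def score(text):
    """Sum lexicon scores over the word tokens of a text span."""
    return sum(LEXICON.get(t, 0) for t in re.findall(r"[a-z]+", text.lower()))

def sentence_scores(document):
    """Split on sentence-ending punctuation and score each sentence separately."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    return [(s, score(s)) for s in sentences]

review = "The staff were friendly. The room was noisy and the food was awful."

# Sentence-level: one positive sentence and one negative sentence.
print(sentence_scores(review))

# Document-level: a single net score that hides the positive remark about the staff.
print(score(review))
```

The document-level score collapses mixed opinions into one number, while the sentence-level view keeps them apart. Going further down to phrase level would separate "room was noisy" from "food was awful" as well.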
Types of sentiment analysis
Comparative sentence analysis
Feature/aspect-based - sentiment expressed on whole entities/subjects or on individual features/aspects
Intent analysis - understanding the underlying intent or intended action, which may require contextual information
Multilingual
Challenges
Handling negations or double negatives (e.g. "do not dislike")
Adverbials changing the meaning/intensity of the sentiment
Tonality, irony and sarcasm
Emojis and special characters
Multiple entities/subjects are named - which one does the sentiment apply to?
Typical negative terms being used positively in certain contexts
Opposing opinions in the same text that leave the overall sentiment unclear
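The first two challenges, negations and intensity-changing adverbials, can be partly handled with simple rules. The sketch below is one possible heuristic and not a reference implementation: the word lists, the multipliers and the score name are all assumptions made for the example. A negator flips the sign of the next sentiment word, and an intensifier scales it.

```python
import re

# Invented toy word lists and weights for illustration only.
LEXICON = {"good": 1.0, "like": 1.0, "bad": -1.0, "dislike": -1.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}

def score(text):
    """Score text, flipping polarity after a negator and scaling after an intensifier."""
    total = 0.0
    sign, scale = 1.0, 1.0
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in NEGATORS:
            sign = -sign  # a second negator flips the sign back
        elif token in INTENSIFIERS:
            scale *= INTENSIFIERS[token]
        elif token in LEXICON:
            total += sign * scale * LEXICON[token]
            sign, scale = 1.0, 1.0  # modifiers apply to the next sentiment word only
        # any other token leaves pending modifiers in place
    return total

print(score("I do not dislike it"))  # negated negative word
print(score("very good"))            # intensified positive word
print(score("not very good"))        # negation combined with an intensifier
```

This heuristic has clear limits: contractions such as "don't" are not recognised, a negator stays pending until the next sentiment word however far away it is, and "not bad" scores the same as "good". Irony, sarcasm and context-dependent terms are out of reach for rules of this kind.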