Sentiment Analysis Without Modeling: TextBlob vs. VADER vs. Flair
Sentiment Prediction Accuracy Comparison
Sentiment analysis can be done with or without building a machine learning model. This article will go over the Python implementation of TextBlob, VADER, and Flair for non-model sentiment analysis.
After reading the article, you will learn
- What is TextBlob?
- What is VADER?
- What is Flair?
- How to use the Python library TextBlob for sentiment analysis?
- How to use Python library VADER for sentiment analysis?
- How to use Flair’s pre-trained model in Python for sentiment analysis?
- Comparison between TextBlob vs. VADER vs. Flair.
Let’s get started!
Step 1: Install and Import Python Libraries
The first step is to install and import Python libraries. We need to install the vaderSentiment package for VADER and the flair package for flair.
# Install vaderSentiment package for VADER
pip install vaderSentiment
# Install flair
pip install flair
After installing the packages, let’s import the Python libraries. We need to import pandas and numpy for data processing.
For the sentiment analysis, we need to import TextBlob, SentimentIntensityAnalyzer from vaderSentiment, and TextClassifier from flair. We also need to load the pre-trained English sentiment model with TextClassifier.load('en-sentiment') and import Sentence to prepare the input text for the flair pre-trained model.
To check the sentiment prediction accuracy, we need to import accuracy_score from sklearn.
Last but not least, we set the pandas data frame column width to 1000, which lets us see more of each review.
# Data processing
import pandas as pd
import numpy as np
# Import TextBlob
from textblob import TextBlob
# Import VADER sentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
# Import flair pre-trained sentiment model
from flair.models import TextClassifier
classifier = TextClassifier.load('en-sentiment')
# Import flair Sentence to process input text
from flair.data import Sentence
# Import accuracy_score to check performance
from sklearn.metrics import accuracy_score
# Set a wider colwidth
pd.set_option('display.max_colwidth', 1000)
Step 2: Download and Read in Data
The second step is to download and read the dataset.
The UCI Machine Learning Repository has the review data from three websites: imdb.com, amazon.com, and yelp.com. We will use the review data from amazon.com for this tutorial. Please follow these steps to download the data.
- Go to: https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences
- Click “Data Folder”
- Download “sentiment labelled sentences.zip”
- Unzip “sentiment labelled sentences.zip”
- Copy the file “amazon_cells_labelled.txt” to your project folder
Those who are using Google Colab for this analysis need to mount Google Drive to read the dataset. You can ignore the code below if you are not using Google Colab.
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
# Change directory
import os
os.chdir("drive/My Drive/contents/sentiment_analysis")
# Print out the current directory
!pwd
Output:
Mounted at /content/drive
/content/drive/My Drive/contents/sentiment_analysis
Now let’s read the data into a pandas data frame and see what the dataset looks like.
# Read in data
amz_review = pd.read_csv('sentiment labelled sentences/amazon_cells_labelled.txt', sep='\t', names=['review', 'label'])
# Take a look at the data
amz_review.head()
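Each line of the file holds one review sentence, a tab, and a 0/1 label. As a minimal sketch of that layout, the code below parses two lines using only the standard library. The two sample sentences are invented for illustration and are not rows from the dataset.

```python
import csv
import io

# Two invented lines in the "sentence<TAB>label" layout (not real dataset rows)
sample = "Works great so far.\t1\nThe battery died after a week.\t0\n"

# QUOTE_NONE treats quote characters inside a review as ordinary text
reader = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
rows = [(text, int(label)) for text, label in reader]

for text, label in rows:
    print(label, text)
```

If a file read with pandas ever returns fewer rows than expected, stray quote characters in the text are one possible cause worth checking.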
.info() helps us get information about the dataset.
# Get the dataset information
amz_review.info()
From the output, we can see that this data set has two columns, 1000 records, and no missing data. The ‘review’ column is object type, and the ‘label’ column is int64 type.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   review  1000 non-null   object
 1   label   1000 non-null   int64
dtypes: int64(1), object(1)
memory usage: 15.8+ KB
Next, let’s check the distribution of the label.
# Check the label distribution
amz_review['label'].value_counts()
There are 500 positive and 500 negative reviews in the dataset, so we have a balanced dataset. For a balanced dataset, we can use accuracy as the performance metric.
1 500
0 500
Name: label, dtype: int64
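To see why balance matters when using accuracy, here is a small sketch that scores a trivial "always predict positive" baseline on a balanced and an imbalanced label set. The 95/5 split is a made-up example, and the accuracy helper is written by hand for illustration.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

balanced = [1] * 500 + [0] * 500      # same split as the Amazon reviews
imbalanced = [1] * 950 + [0] * 50     # hypothetical 95/5 split

# A "model" that always predicts positive
always_positive = [1] * 1000

balanced_acc = accuracy(balanced, always_positive)
imbalanced_acc = accuracy(imbalanced, always_positive)
print(balanced_acc)    # 0.5
print(imbalanced_acc)  # 0.95
```

On the balanced data the useless baseline scores only 50%, so any accuracy well above that reflects real signal. On the imbalanced data the same baseline looks deceptively good.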
If you are interested in imbalanced dataset classification, please check out my imbalanced classification and anomaly detection articles.
- Four Oversampling And Under-Sampling Methods For Imbalanced Classification Using Python
- Neural Network Model Balanced Weight For Imbalanced Classification In Keras
- Isolation Forest For Anomaly Detection
- One-Class Support Vector Machine (SVM) For Anomaly Detection
- Autoencoder For Anomaly Detection Using Tensorflow Keras
Step 3: What is TextBlob?
TextBlob is a Python library for Natural Language Processing (NLP). Sentiment analysis is one of many NLP tasks that TextBlob supports.
The sentiment property in TextBlob returns a polarity score and a subjectivity score for the input text.
- The polarity score ranges from -1 to 1, where -1 means extremely negative, and 1 means highly positive. A score near 0 means neutral sentiment.
- The subjectivity score ranges from 0 to 1, where 0 means extremely objective and 1 means highly subjective.
# Example text
text = 'GrabNGoInfo.com is a great machine learning tutorial website.'
# TextBlob sentiment
textBlob_sentiment = TextBlob(text).sentiment
textBlob_sentiment
In this example, the sample text ‘GrabNGoInfo.com is a great machine learning tutorial website.’ has a polarity score of 0.8 and a subjectivity score of 0.75, which means TextBlob interprets the sentence as subjective and positive.
Sentiment(polarity=0.8, subjectivity=0.75)
We can use .polarity to extract the sentiment score.
# Extract sentiment score
textBlob_sentiment.polarity
Output:
0.8
Step 4: What is VADER?
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based sentiment analysis tool, available as a Python library, that focuses on social media text. Its rules adjust sentiment intensity based on punctuation, slang, emojis, and acronyms.
The output of VADER includes four scores: compound score, negative score, neutral score, and positive score.
- The pos, neu, and neg scores represent the proportion of the text that falls into each category, so they add up to 1 (100%).
- The compound score is a single score that measures the overall sentiment of the text. Similar to TextBlob, it ranges from -1 (extremely negative) to 1 (extremely positive). Scores near 0 represent neutral sentiment.
- The compound score is not a simple aggregation of the pos, neu, and neg scores. Instead, it incorporates rule-based enhancements such as punctuation amplifiers.
# Example text
text = 'GrabNGoInfo.com is a great machine learning tutorial website.'
# VADER Sentiment
vader = SentimentIntensityAnalyzer()
vader_sentiment = vader.polarity_scores(text)
vader_sentiment
VADER gave the sample text ‘GrabNGoInfo.com is a great machine learning tutorial website.’ a compound score of 0.6249. There is no negative word in the sentence, so the neg score is 0. The neu score is 0.631 and the pos score is 0.369, meaning about 63.1% of the text is rated neutral and 36.9% positive.
{'compound': 0.6249, 'neg': 0.0, 'neu': 0.631, 'pos': 0.369}
The output of VADER is saved as a dictionary. We can extract the compound sentiment score by the key ‘compound’.
# Extract sentiment score
vader_sentiment['compound']
Output:
0.6249
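The compound score for this sentence can be reproduced by hand. The sketch below mirrors the normalization used in the vaderSentiment source, x / sqrt(x² + 15), and assumes that ‘great’ is the only word in the sentence found in the lexicon, with a valence of 3.1. Both points are assumptions worth verifying against your installed version.

```python
import math

def normalize(score, alpha=15):
    """Squash a summed valence score into the range (-1, 1)."""
    return score / math.sqrt(score * score + alpha)

# Assumed lexicon valence for "great"; no other word in the sentence is scored
great_valence = 3.1
compound = round(normalize(great_valence), 4)
print(compound)  # 0.6249
```

This illustrates why the compound score is not a simple combination of the pos, neu, and neg proportions: it comes from the summed word valences, which are then normalized.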
Step 5: What is Flair?
Flair is a state-of-the-art NLP framework built on PyTorch. It incorporates recent research and provides an easy way to combine different embeddings for various NLP tasks. Its pre-trained sentiment model lets us run sentiment analysis without training a custom model.
Unlike TextBlob and VADER, which output a sentiment score between -1 and 1, the flair sentiment model outputs a predicted label with a confidence score. The confidence score ranges from 0 to 1, where values near 1 mean the model is very confident in the predicted label.
Before predicting the sentiment of a text, the input text needs to be tokenized by Sentence(). The output below shows that the sample text was split into nine tokens.
# Example text
text = 'GrabNGoInfo.com is a great machine learning tutorial website.'
# Flair tokenization
sentence = Sentence(text)
sentence
Output:
Sentence: "GrabNGoInfo.com is a great machine learning tutorial website ." [− Tokens: 9]
Next, we use .predict to predict the sentiment.
# Flair sentiment prediction
classifier.predict(sentence)
sentence
The sample text ‘GrabNGoInfo.com is a great machine learning tutorial website.’ has the predicted sentiment of POSITIVE and a confidence score of 0.9895. Since 0.9895 is close to 1, flair is very confident about the prediction.
Sentence: "GrabNGoInfo.com is a great machine learning tutorial website ." [− Tokens: 9 − Sentence-Labels: {'label': [POSITIVE (0.9895)]}]
The predicted label is saved as a value and the prediction confidence is saved as a score. We can use sentence.labels[0].value and sentence.labels[0].score, respectively, to extract them.
# Extract sentiment prediction score
print(f'Flair classified the review as {sentence.labels[0].value} with the score of {sentence.labels[0].score:.2f}')
Output:
Flair classified the review as POSITIVE with the score of 0.99
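Because flair returns a label plus a confidence rather than a score between -1 and 1, it can be handy to fold the two into one signed number when lining it up against TextBlob and VADER. The helper below is a hypothetical convenience function, not part of flair, and a confidence is not the same thing as sentiment intensity.

```python
def signed_score(label, confidence):
    """Map a (label, confidence) pair to a score in [-1, 1]."""
    if label == "POSITIVE":
        return confidence
    if label == "NEGATIVE":
        return -confidence
    raise ValueError(f"Unexpected label: {label}")

print(signed_score("POSITIVE", 0.9895))   # 0.9895
print(signed_score("NEGATIVE", 0.9895))   # -0.9895
```

With the sentence above, this could be called as signed_score(sentence.labels[0].value, sentence.labels[0].score).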
Step 6: How to Use TextBlob for Sentiment Analysis
In step 6, we will apply TextBlob to the Amazon review dataset and see how it performs.
We first get the sentiment polarity for each review and save the values into a column called ‘scores_TextBlob’. Then we check the sign of the polarity score. If the score is greater than or equal to zero, the predicted sentiment for the review is positive (labeled as 1), so a neutral score of exactly 0 also counts as positive. Otherwise, the predicted sentiment for the review is negative (labeled as 0).
# Get sentiment score for each review
amz_review['scores_TextBlob'] = amz_review['review'].apply(lambda s: TextBlob(s).sentiment.polarity)
# Predict sentiment label for each review
amz_review['pred_TextBlob'] = amz_review['scores_TextBlob'].apply(lambda x: 1 if x >=0 else 0)
amz_review.head()
After getting predictions from TextBlob, let’s check the prediction accuracy.
# Compare Actual and Predicted
accuracy_score(amz_review['label'],amz_review['pred_TextBlob'])
Comparing the actual label with the TextBlob prediction, we get an accuracy score of 0.688, which means that TextBlob predicted the review sentiment correctly 68.8% of the time.
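One detail of the threshold rule is that a score of exactly 0 is labeled positive. TextBlob returns a polarity of 0 when it finds no sentiment-bearing words, so every such review defaults to the positive class. The sketch below uses made-up scores and labels to show how to count those ties alongside the accuracy.

```python
# Made-up polarity scores and true labels, for illustration only
scores = [0.8, 0.0, -0.5, 0.0, 0.3]
labels = [1, 0, 0, 1, 1]

# Same rule as in this step: score >= 0 means positive
preds = [1 if s >= 0 else 0 for s in scores]

# Reviews with a score of exactly 0 default to the positive class
n_ties = sum(s == 0 for s in scores)
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)

print(n_ties)     # 2
print(accuracy)   # 0.8
```

On the real data, the same count could be obtained with (amz_review['scores_TextBlob'] == 0).sum(), which shows how much of the accuracy depends on this tie-breaking choice.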
Step 7: How to Use VADER for Sentiment Analysis
In step 7, we will apply VADER to the Amazon review dataset and see how it performs.
We first get the sentiment compound score for each review and save the values into a column called ‘scores_VADER’. Then we check the sign of the compound score. If the score is greater than or equal to zero, the predicted sentiment for the review is positive (labeled as 1), so a neutral score of exactly 0 also counts as positive. Otherwise, the predicted sentiment for the review is negative (labeled as 0).
# Get sentiment score for each review
# (the analyzer gets its own name so it does not overwrite the score dictionary from step 4)
vader = SentimentIntensityAnalyzer()
amz_review['scores_VADER'] = amz_review['review'].apply(lambda s: vader.polarity_scores(s)['compound'])
# Predict sentiment label for each review
amz_review['pred_VADER'] = amz_review['scores_VADER'].apply(lambda x: 1 if x >=0 else 0)
amz_review.head()
After getting predictions from VADER, let’s check the prediction accuracy.
# Compare Actual and Predicted
accuracy_score(amz_review['label'],amz_review['pred_VADER'])
Output:
0.768
Comparing the actual label with the VADER prediction, we get an accuracy score of 0.768, which means that VADER predicted the review sentiment correctly 76.8% of the time.
TextBlob has a prediction accuracy of 68.8% for the same dataset, so VADER improves on TextBlob by 8 percentage points.
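An accuracy gap can be stated in percentage points or as a relative change, and the two numbers differ. The short calculation below uses the two accuracy values reported above to show both.

```python
textblob_acc = 0.688
vader_acc = 0.768

# Absolute gap in percentage points
point_gain = (vader_acc - textblob_acc) * 100
# Relative improvement over the TextBlob accuracy
relative_gain = (vader_acc - textblob_acc) / textblob_acc * 100

print(f"{point_gain:.1f} percentage points")   # 8.0 percentage points
print(f"{relative_gain:.1f}% relative")        # 11.6% relative
```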
Step 8: How to Use Flair for Sentiment Analysis
In step 8, we will apply the flair pre-trained sentiment model to predict the sentiment of the Amazon reviews.
First, let’s define a function that takes a review as input and returns the confidence score and the predicted label.
# Define a function to get Flair sentiment prediction score
def score_flair(text):
    sentence = Sentence(text)
    classifier.predict(sentence)
    score = sentence.labels[0].score
    value = sentence.labels[0].value
    return score, value
Then apply the function to each review in the dataset.
# Run the flair model once per review and keep the (score, label) pair
flair_results = amz_review['review'].apply(score_flair)
# Get sentiment score for each review
amz_review['scores_flair'] = flair_results.apply(lambda r: r[0])
# Predict sentiment label for each review
amz_review['pred_flair'] = flair_results.apply(lambda r: r[1])
# Check the distribution of the score
amz_review['scores_flair'].describe()
The summary statistics show that most flair scores are close to 1, which means that the flair sentiment model is very confident about most of its predictions.
count 1000.000000
mean 0.988019
std 0.046841
min 0.533640
25% 0.996153
50% 0.999167
75% 0.999887
max 0.999999
Name: scores_flair, dtype: float64
The predicted labels are almost evenly split between positive and negative.
# Check the counts of labels
amz_review['pred_flair'].value_counts()
Output:
POSITIVE 504
NEGATIVE 496
Name: pred_flair, dtype: int64
Before checking the prediction accuracy, we need to map the ‘NEGATIVE’ value to 0 and the ‘POSITIVE’ value to 1 because the Amazon review dataset has true labels of 0 and 1.
# Change the label of flair prediction to 0 if negative and 1 if positive
mapping = {'NEGATIVE': 0, 'POSITIVE': 1}
amz_review['pred_flair'] = amz_review['pred_flair'].map(mapping)
After comparing the flair prediction with the actual sentiment label, we can see that flair predicts the sentiment correctly 94.8% of the time! That is 18 percentage points higher than VADER (76.8%) and 26 percentage points higher than TextBlob (68.8%)!
# Compare Actual and Predicted
accuracy_score(amz_review['label'],amz_review['pred_flair'])
Output:
0.948
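Accuracy alone does not show which kind of mistake a tool makes. As a rough sketch, the helper below counts each (true label, predicted label) pair so that false positives and false negatives can be compared. The labels and predictions here are made up for illustration, and confusion_counts is a hypothetical helper written for this example.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))

# Made-up labels and predictions, for illustration only
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
print(counts[(1, 1)], counts[(1, 0)])   # true positives, false negatives
print(counts[(0, 0)], counts[(0, 1)])   # true negatives, false positives
```

With the tutorial’s data frame, the same function could be called with amz_review['label'] and any of the three prediction columns to compare where TextBlob, VADER, and flair go wrong.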