Amy @GrabNGoInfo

Summary

The web content provides a comprehensive comparison of sentiment analysis techniques using TextBlob, VADER, and Flair in Python without the need for model training.

Abstract

The article "Sentiment Analysis Without Modeling: TextBlob vs. VADER vs. Flair" delves into the application of three Python libraries for sentiment analysis: TextBlob, VADER, and Flair. It guides readers through the process of installing and using these libraries to analyze sentiment in text data, specifically reviews from Amazon. The performance of each library is evaluated based on their accuracy in predicting sentiment labels, with Flair demonstrating superior accuracy, followed by VADER, and then TextBlob. The article also explains the nuances of sentiment scores provided by each library, the speed at which they operate, and the methodologies they employ, offering insights into their practical effectiveness for sentiment analysis tasks.

Opinions

  • The author suggests that Flair's pre-trained sentiment model outperforms TextBlob and VADER in prediction accuracy, by 18 percentage points over VADER and 26 percentage points over TextBlob.
  • TextBlob and VADER are described as lexicon and rule-based, while Flair is noted for being model-based, which may contribute to its higher accuracy but slower performance.
  • The sentiment scores from TextBlob and VADER range from -1 to 1, indicating negative to positive sentiment, whereas Flair provides a confidence score between 0 and 1, reflecting the model's confidence in its predictions.
  • The article implies that VADER is more suitable for social media text sentiment analysis due to its ability to interpret punctuation, slang, emojis, and acronyms.
  • The author emphasizes the importance of a balanced dataset for classification tasks and provides additional resources for readers interested in imbalanced dataset classification and anomaly detection.
  • The sentiment analysis process is presented as user-friendly, with code snippets and a video tutorial provided to facilitate understanding and implementation.
  • The article is part of a series of tutorials and resources offered by the author, indicating a commitment to educating readers on various aspects of natural language processing (NLP) and machine learning.

Sentiment Analysis Without Modeling: TextBlob vs. VADER vs. Flair

Sentiment Prediction Accuracy Comparison

Photo by Guillaume Baudusseau on Unsplash

Sentiment analysis can be done with or without building a machine learning model. This article goes over the Python implementation of TextBlob, VADER, and Flair for sentiment analysis without training a model of our own.

After reading the article, you will learn

  • What is TextBlob?
  • What is VADER?
  • What is Flair?
  • How to use the Python library TextBlob for sentiment analysis?
  • How to use the Python library VADER for sentiment analysis?
  • How to use Flair’s pre-trained model in Python for sentiment analysis?
  • Comparison between TextBlob vs. VADER vs. Flair.

Resources for this post:

  • Video tutorial on YouTube
  • Python code is at the end of the post. Click here for the notebook.
  • More video tutorials on NLP
  • More blog posts on NLP

Let’s get started!

Step 1: Install and Import Python Libraries

The first step is to install and import Python libraries. We need to install the vaderSentiment package for VADER and the flair package for Flair.

# Install vaderSentiment package for VADER
pip install vaderSentiment
# Install flair
pip install flair
# Install textblob if it is not already available in your environment
pip install textblob

After installing the packages, let’s import the Python libraries. We need to import pandas and numpy for data processing.

For the sentiment analysis, we need to import TextBlob, SentimentIntensityAnalyzer from vaderSentiment, and TextClassifier from flair. We also need to load the pre-trained English sentiment model with TextClassifier.load and import Sentence, which prepares the input text for the flair pre-trained model.

To check the sentiment prediction accuracy, we need to import accuracy_score from sklearn.

Last but not least, we set the pandas data frame column width to be 1000, which will allow us to see more content from the review.

# Data processing
import pandas as pd
import numpy as np
# Import TextBlob
from textblob import TextBlob
# Import VADER sentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
# Import flair pre-trained sentiment model
from flair.models import TextClassifier
classifier = TextClassifier.load('en-sentiment')
# Import flair Sentence to process input text
from flair.data import Sentence
# Import accuracy_score to check performance
from sklearn.metrics import accuracy_score
# Set a wider column width
pd.set_option('display.max_colwidth', 1000)

Step 2: Download and Read in Data

The second step is to download and read the dataset.

The UCI Machine Learning Repository has the review data from three websites: imdb.com, amazon.com, and yelp.com. We will use the review data from amazon.com for this tutorial. Please follow these steps to download the data.

  1. Go to: https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences
  2. Click “Data Folder”
  3. Download “sentiment labelled sentences.zip”
  4. Unzip “sentiment labelled sentences.zip”
  5. Copy the unzipped folder “sentiment labelled sentences”, which contains the file “amazon_cells_labelled.txt”, to your project folder

Those who are using Google Colab for this analysis need to mount Google Drive to read the dataset. You can ignore the code below if you are not using Google Colab.

# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
# Change directory
import os
os.chdir("drive/My Drive/contents/sentiment_analysis")
# Print out the current directory
!pwd

Output:

Mounted at /content/drive
/content/drive/My Drive/contents/sentiment_analysis

Now let’s read the data into a pandas data frame and see what the dataset looks like.

# Read in data
amz_review = pd.read_csv('sentiment labelled sentences/amazon_cells_labelled.txt', sep='\t', names=['review', 'label'])
# Take a look at the data
amz_review.head()

.info() gives us summary information about the dataset.

# Get the dataset information
amz_review.info()

From the output, we can see that this data set has two columns, 1000 records, and no missing data. The ‘review’ column is object type, and the ‘label’ column is int64 type.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 2 columns):
#   Column  Non-Null Count  Dtype
---  ------  --------------  -----
0   review  1000 non-null   object
1   label   1000 non-null   int64
dtypes: int64(1), object(1)
memory usage: 15.8+ KB

Next, let’s check the distribution of the label.

# Check the label distribution
amz_review['label'].value_counts()

There are 500 positive and 500 negative reviews in the dataset, so we have a balanced dataset. For a balanced dataset, we can use accuracy as the performance metric.

1    500
0    500
Name: label, dtype: int64
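
To see why the class balance matters for accuracy, here is a minimal sketch in plain Python. It compares the accuracy of always guessing the most common label on a 500/500 split (the same balance as our data) and on a 950/50 split. The 950/50 split and both helper functions are made up for illustration and are not part of the original notebook.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline_accuracy(y_true):
    """Accuracy we would get by always predicting the most common label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)

# Same class balance as the Amazon data: 500 positive, 500 negative
balanced = [1] * 500 + [0] * 500
# Hypothetical imbalanced split, for contrast only
imbalanced = [1] * 950 + [0] * 50

print(majority_baseline_accuracy(balanced))    # 0.5
print(majority_baseline_accuracy(imbalanced))  # 0.95
```

On the balanced data, a predictor that ignores the text gets only 50% accuracy, so any score well above that reflects real signal. On the imbalanced split, the same do-nothing predictor already reaches 95%, which is why accuracy alone can be misleading there.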

If you are interested in imbalanced dataset classification, please check out my imbalanced classification and anomaly detection articles.

Step 3: What is TextBlob?

TextBlob is a Python library for Natural Language Processing (NLP). Sentiment analysis is one of many NLP tasks that TextBlob supports.

The sentiment property in TextBlob returns a polarity score and a subjectivity score for the input text.

  • The polarity score ranges from -1 to 1, where -1 means extremely negative and 1 means extremely positive. A score near 0 means neutral sentiment.
  • The subjectivity score ranges from 0 to 1, where 0 means extremely objective and 1 means extremely subjective.

# Example text
text = 'GrabNGoInfo.com is a great machine learning tutorial website.'
# TextBlob sentiment
textBlob_sentiment = TextBlob(text).sentiment
textBlob_sentiment

In this example, the sample text ‘GrabNGoInfo.com is a great machine learning tutorial website.’ has a polarity score of 0.8 and a subjectivity score of 0.75, which means TextBlob interprets the sentence as positive and fairly subjective.

Sentiment(polarity=0.8, subjectivity=0.75)

We can use .polarity to extract the sentiment score.

# Extract sentiment score
textBlob_sentiment.polarity

Output:

0.8

Step 4: What is VADER?

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a Python library focused on social media sentiment. It has built-in rules that adjust sentiment intensity based on punctuation, slang, emojis, and acronyms.

The output of VADER includes four scores: compound score, negative score, neutral score, and positive score.

  • The pos, neu, and neg scores represent the proportion of the text that falls into each category, so they add up to 1 (100%).
  • The compound score is a single score that measures the overall sentiment of the text. Similar to TextBlob, it ranges from -1 (extremely negative) to 1 (extremely positive). Scores near 0 represent neutral sentiment.
  • The compound score is not a simple aggregation of the pos, neu, and neg scores. Instead, it incorporates rule-based enhancements such as punctuation amplifiers.

# Example text
text = 'GrabNGoInfo.com is a great machine learning tutorial website.'
# VADER Sentiment
vader = SentimentIntensityAnalyzer()
vader_sentiment = vader.polarity_scores(text)
vader_sentiment

VADER gave the sample text ‘GrabNGoInfo.com is a great machine learning tutorial website.’ a compound score of 0.6249. There is no negative word in the sentence, so the neg score is 0. The neu score is 0.631 and the pos score is 0.369, so about 63.1% of the sentence is scored as neutral and 36.9% as positive.

{'compound': 0.6249, 'neg': 0.0, 'neu': 0.631, 'pos': 0.369}
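
The compound score of 0.6249 can be reproduced by hand. The sketch below rests on two assumptions taken from my reading of the vaderSentiment source rather than from this article: that ‘great’ is the only lexicon word in the sentence, with a valence of 3.1, and that VADER squashes the summed valence x with x / sqrt(x² + alpha), where alpha = 15. Treat it as an illustration, not as VADER's full algorithm.

```python
import math

def normalize(raw_score, alpha=15):
    """Squash an unbounded valence sum into the range (-1, 1).

    Assumed to mirror VADER's normalization step; alpha=15 is an
    assumption based on the vaderSentiment source code.
    """
    return raw_score / math.sqrt(raw_score * raw_score + alpha)

# Assumption: "great" is the only sentiment-bearing word, with valence 3.1
great_valence = 3.1
print(round(normalize(great_valence), 4))  # 0.6249
```

This also shows why the compound score never reaches exactly 1 or -1: the denominator is always slightly larger than the absolute value of the raw score.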

The output of VADER is saved as a dictionary. We can extract the compound sentiment score by the key ‘compound’.

# Extract sentiment score
vader_sentiment['compound']

Output:

0.6249
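
To make the idea of a lexicon plus rules more concrete, here is a toy scorer in plain Python. The four-word lexicon, the negation rule, and the exclamation-mark rule are all invented for this illustration. They are not VADER's or TextBlob's actual lexicon or rules, which are much richer.

```python
import re

# Made-up lexicon: word -> valence (positive or negative strength)
TOY_LEXICON = {"great": 3.0, "good": 2.0, "bad": -2.0, "terrible": -3.0}
# Made-up negation words that flip the valence of the next word
NEGATIONS = {"not", "never", "no"}

def toy_sentiment(text):
    """Sum word valences, flip after a negation, amplify for '!'."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        valence = TOY_LEXICON.get(token, 0.0)
        # Rule 1: a negation word right before a lexicon word flips its sign
        if i > 0 and tokens[i - 1] in NEGATIONS:
            valence = -valence
        total += valence
    # Rule 2: each exclamation mark (up to three) adds 10% intensity
    boost = 1 + 0.1 * min(text.count("!"), 3)
    return total * boost

print(toy_sentiment("The screen is great"))                # 3.0
print(toy_sentiment("The screen is not great"))            # -3.0
print(round(toy_sentiment("The screen is great!!"), 2))    # 3.6
```

No training data is involved at any point. The quality of a scorer like this depends entirely on how good the lexicon and the rules are.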

Step 5: What is Flair?

Flair is a state-of-the-art NLP framework built on PyTorch. It incorporates recent research and provides an easy way to combine different embeddings for various NLP tasks. Its pre-trained sentiment model lets us run sentiment analysis without training a custom model.

Unlike TextBlob and VADER, which output a sentiment score between -1 and 1, Flair outputs a predicted label together with a confidence score. The confidence score ranges from 0 to 1, where values near 1 mean the model is very confident in the label and values near 0 mean it is not.

Before predicting the sentiment of a text, we need to wrap it in a Sentence object, which tokenizes the text. The output shows that the sample text was split into nine tokens.

# Example text
text = 'GrabNGoInfo.com is a great machine learning tutorial website.'
# Flair tokenization
sentence = Sentence(text)
sentence

Output:

Sentence: "GrabNGoInfo.com is a great machine learning tutorial website ."   [− Tokens: 9]

Next, we use .predict to predict the sentiment.

# Flair sentiment prediction
classifier.predict(sentence)
sentence

The sample text ‘GrabNGoInfo.com is a great machine learning tutorial website.’ has the predicted sentiment of POSITIVE and a confidence score of 0.9895. Since 0.9895 is close to 1, flair is very confident about the predictions.

Sentence: "GrabNGoInfo.com is a great machine learning tutorial website ."   [− Tokens: 9  − Sentence-Labels: {'label': [POSITIVE (0.9895)]}]

The predicted label is saved as a value and the prediction confidence is saved as a score. We can use sentence.labels[0].value and sentence.labels[0].score, respectively, to extract them.

# Extract sentiment prediction score
print(f'Flair classified the review as {sentence.labels[0].value} with the score of {sentence.labels[0].score:.2f}')

Output:

Flair classified the review as POSITIVE with the score of 0.99
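
Because Flair returns a label plus a confidence instead of a signed score, its output is not directly comparable to the TextBlob and VADER scores. One possible way to put it on the same -1 to 1 axis is sketched below. The helper flair_to_signed_score is hypothetical and not part of Flair, and a confidence is not really a sentiment intensity, so use the result only for rough side-by-side inspection.

```python
def flair_to_signed_score(label, confidence):
    """Map a (label, confidence) pair to a value between -1 and 1.

    Hypothetical helper, not part of Flair: positive labels keep the
    confidence as is, negative labels get the confidence with a minus sign.
    """
    if label == "POSITIVE":
        return confidence
    if label == "NEGATIVE":
        return -confidence
    raise ValueError(f"Unexpected label: {label!r}")

# Values taken from the example output above
print(flair_to_signed_score("POSITIVE", 0.9895))  # 0.9895
# Made-up negative example for contrast
print(flair_to_signed_score("NEGATIVE", 0.75))    # -0.75
```

With the sentence from this step, the call would be flair_to_signed_score(sentence.labels[0].value, sentence.labels[0].score).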

Step 6: How to Use TextBlob for Sentiment Analysis

In step 6, we will apply TextBlob to the Amazon review dataset and see how it performs.

We first get the sentiment polarity for each review and save the values in a column called ‘scores_TextBlob’. Then we convert the score to a label: if the score is greater than or equal to zero, the predicted sentiment is positive (labeled as 1); otherwise it is negative (labeled as 0). Note that a neutral score of exactly zero is therefore counted as positive.

# Get sentiment score for each review
amz_review['scores_TextBlob'] = amz_review['review'].apply(lambda s: TextBlob(s).sentiment.polarity)
# Predict sentiment label for each review
amz_review['pred_TextBlob'] = amz_review['scores_TextBlob'].apply(lambda x: 1 if x >=0 else 0)
amz_review.head()
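
The thresholding rule is worth a closer look, because every review with a score of exactly zero ends up in the positive class. Here is a small sketch of the same rule in plain Python. The example scores are made up, and score_to_label and count_zero_scores are illustrative helpers rather than part of the original notebook.

```python
def score_to_label(score, threshold=0.0):
    """Return 1 (positive) if the score is at or above the threshold, else 0."""
    return 1 if score >= threshold else 0

def count_zero_scores(scores):
    """Count scores that are exactly zero, i.e. no sentiment was detected."""
    return sum(1 for s in scores if s == 0)

# Made-up polarity scores for illustration
example_scores = [0.8, 0.0, -0.3, 0.0, 0.15]

labels = [score_to_label(s) for s in example_scores]
print(labels)                             # [1, 1, 0, 1, 1]
print(count_zero_scores(example_scores))  # 2
```

On the real data, (amz_review['scores_TextBlob'] == 0).sum() gives the number of reviews that fall into this group. If that number is large, the choice of threshold has a noticeable effect on the accuracy.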

After getting predictions from TextBlob, let’s check the prediction accuracy.

# Compare Actual and Predicted
accuracy_score(amz_review['label'],amz_review['pred_TextBlob'])

Comparing the actual label with the TextBlob prediction, we get an accuracy score of 0.688, which means that TextBlob predicted the review sentiment correctly 68.8% of the time.

Step 7: How to Use VADER for Sentiment Analysis

In step 7, we will apply VADER to the Amazon review dataset and see how it performs.

We first get the sentiment compound score for each review and save the values in a column called ‘scores_VADER’. Then we convert the score to a label: if the score is greater than or equal to zero, the predicted sentiment is positive (labeled as 1); otherwise it is negative (labeled as 0). As with TextBlob, a score of exactly zero is counted as positive.

# Get sentiment score for each review
vader_sentiment = SentimentIntensityAnalyzer()
amz_review['scores_VADER'] = amz_review['review'].apply(lambda s: vader_sentiment.polarity_scores(s)['compound'])
# Predict sentiment label for each review
amz_review['pred_VADER'] = amz_review['scores_VADER'].apply(lambda x: 1 if x >=0 else 0)
amz_review.head()

After getting predictions from VADER, let’s check the prediction accuracy.

# Compare Actual and Predicted
accuracy_score(amz_review['label'],amz_review['pred_VADER'])

Output:

0.768

Comparing the actual label with the VADER prediction, we get an accuracy score of 0.768, which means that VADER predicted the review sentiment correctly 76.8% of the time.

TextBlob has a prediction accuracy of 68.8% for the same dataset, so VADER is 8 percentage points more accurate than TextBlob.
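
Accuracy is a single number, so it does not tell us which kind of mistake a library makes. A simple way to look deeper is to count the four possible outcomes. The sketch below is plain Python with made-up labels, and confusion_counts is an illustrative helper that is not part of the original notebook.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["tp"] += 1   # positive review predicted positive
        elif actual == 0 and predicted == 0:
            counts["tn"] += 1   # negative review predicted negative
        elif actual == 0 and predicted == 1:
            counts["fp"] += 1   # negative review predicted positive
        else:
            counts["fn"] += 1   # positive review predicted negative
    return counts

# Made-up labels for illustration
actual_labels = [1, 1, 0, 0, 0, 1]
predicted_labels = [1, 1, 1, 1, 0, 0]
print(confusion_counts(actual_labels, predicted_labels))
# {'tp': 2, 'tn': 1, 'fp': 2, 'fn': 1}
```

The same helper can be called with amz_review['label'] and amz_review['pred_VADER'] or amz_review['pred_TextBlob']. A high fp count would suggest that the "zero or above is positive" rule is pushing neutral-scored negative reviews into the positive class.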

Step 8: How to Use Flair for Sentiment Analysis

In step 8, we will apply the flair pre-trained sentiment model to predict the sentiment of the Amazon reviews.

First, let’s define a function that takes a review as input and returns the prediction confidence and the predicted label.

# Define a function to get Flair sentiment prediction score
def score_flair(text):
  sentence = Sentence(text)
  classifier.predict(sentence)
  score = sentence.labels[0].score
  value = sentence.labels[0].value
  return score, value

Then apply the function to each review in the dataset.

# Run flair once per review, then split the (score, label) pairs into two columns
amz_review['scores_flair'], amz_review['pred_flair'] = zip(*amz_review['review'].apply(score_flair))
# Check the distribution of the score
amz_review['scores_flair'].describe()

The summary statistics show that most flair scores are close to 1, so the model is very confident in most of its predictions. Confidence is not the same as correctness, which we check against the true labels below.

count    1000.000000
mean        0.988019
std         0.046841
min         0.533640
25%         0.996153
50%         0.999167
75%         0.999887
max         0.999999
Name: scores_flair, dtype: float64

The distribution of the predicted labels is very balanced.

# Check the counts of labels
amz_review['pred_flair'].value_counts()

Output:

POSITIVE    504
NEGATIVE    496
Name: pred_flair, dtype: int64

Before checking the prediction accuracy, we need to map the ‘NEGATIVE’ value to 0 and the ‘POSITIVE’ value to 1 because the Amazon review dataset has true labels of 0 and 1.

# Change the label of flair prediction to 0 if negative and 1 if positive
mapping = {'NEGATIVE': 0, 'POSITIVE': 1}
amz_review['pred_flair'] = amz_review['pred_flair'].map(mapping)

After comparing the flair prediction with the actual sentiment label, we can see that flair predicts the sentiment correctly 94.8% of the time! That is 18 percentage points higher than VADER and 26 percentage points higher than TextBlob!

# Compare Actual and Predicted
accuracy_score(amz_review['label'],amz_review['pred_flair'])

Output:

0.948
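
When comparing accuracies, it helps to keep percentage points (the plain difference) apart from the relative increase (the difference divided by the starting value). The short calculation below uses the three accuracy scores from this article. The compare function is just an illustrative helper.

```python
# Accuracy scores reported in this article
accuracies = {"TextBlob": 0.688, "VADER": 0.768, "Flair": 0.948}

def compare(new, old):
    """Return (gain in percentage points, relative gain in percent)."""
    point_gain = (new - old) * 100
    relative_gain = (new - old) / old * 100
    return round(point_gain, 1), round(relative_gain, 1)

print(compare(accuracies["Flair"], accuracies["VADER"]))     # (18.0, 23.4)
print(compare(accuracies["Flair"], accuracies["TextBlob"]))  # (26.0, 37.8)
print(compare(accuracies["VADER"], accuracies["TextBlob"]))  # (8.0, 11.6)
```

So flair is 18 percentage points ahead of VADER, which corresponds to a relative increase of about 23.4%.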

Summary: TextBlob vs. VADER vs. Flair

In this article, we compared TextBlob vs. VADER vs. Flair for sentiment analysis.

  • Methodology
      - VADER and TextBlob are lexicon and rule-based.
      - Flair is model-based.
  • Meaning of score
      - VADER and TextBlob have sentiment scores ranging from -1 to 1, where -1 means very negative, 1 means very positive, and 0 means neutral.
      - Flair has a score ranging from 0 to 1, where 0 means very unconfident about the prediction and 1 means very confident about the prediction.
  • Speed Performance
      - Flair is slower than TextBlob and VADER because it runs a neural network model on every review, while the other two mainly look up words in a lexicon and apply rules.
  • Accuracy Performance
      - Based on the results of this project, flair has the best prediction accuracy, VADER is the second best, and TextBlob is the third.
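
This article does not include timing numbers, so if speed matters for your project, it is worth measuring it yourself. Below is a minimal sketch of a timing helper in plain Python. The helper time_scorer, the stand-in toy_scorer, and the sample sentences are all made up so that the snippet runs without any extra packages.

```python
import time

def time_scorer(score_fn, texts, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            score_fn(text)
        best = min(best, time.perf_counter() - start)
    return best

def toy_scorer(text):
    """Stand-in scorer; replace with a real library call when measuring."""
    return 1 if "great" in text.lower() else 0

# Made-up sample reviews, repeated to get a measurable run time
sample = ["This phone is great.", "Battery died in a week."] * 100

elapsed = time_scorer(toy_scorer, sample)
print(f"{elapsed:.6f} seconds for {len(sample)} texts")
```

To compare the three libraries, pass in lambda s: TextBlob(s).sentiment.polarity, lambda s: vader.polarity_scores(s)['compound'], or the score_flair function from step 8, together with the list of reviews. Taking the best of several runs reduces the effect of one-off slowdowns such as the first model call.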

After reading the article, you have learned

  • What is TextBlob?
  • What is VADER?
  • What is Flair?
  • How to use the Python library TextBlob for sentiment analysis?
  • How to use the Python library VADER for sentiment analysis?
  • How to use Flair’s pre-trained model in Python for sentiment analysis?
  • Comparison between TextBlob vs. VADER vs. Flair.

If you are interested in the nuanced differences between TextBlob and VADER, please refer to my article TextBlob VS VADER For Sentiment Analysis Using Python.

If you are interested in the video version of the tutorial, please check out the video below. Visit the GrabNGoInfo website or the GrabNGoInfo YouTube channel for more machine learning tutorials.
