A Beginner’s Guide to Using BERT for the First Time
From predicting a single sentence, to fine-tuning on a custom dataset, to finding the best hyperparameter configuration.
BERT has become a new standard for Natural Language Processing (NLP). It achieved new state-of-the-art results on eleven NLP tasks, including text classification, sequence labeling, question answering, and more. Even better, it can give strong results using only a small amount of data. BERT was first released in 2018 by Google along with its paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Now we can easily apply BERT to our model by using the Hugging Face (🤗) Transformers library. The library also provides complete documentation for other transformer models. You can check it here. In this post, I will try to summarize some important points that we will likely use frequently. We will take a look at how to use and train models using BERT from 🤗 Transformers. Later, you can also use other transformer models (such as XLM, RoBERTa, XLM-RoBERTa (my favorite!), BART, and many others) by simply changing a single line of code.
Text classification is a good place to start getting to know BERT. There are many kinds of text classification tasks, but we will choose sentiment analysis in this case. Here are the 5 main points that will be covered in this post:
- Installation
- Pipeline
- Fine-tune
- Using custom dataset
- Hyperparameter search
Installation
As stated on their website, running 🤗 Transformers has the following requirements:
- Python 3.6+
- PyTorch 1.10+ or TensorFlow 2.0
They also encourage installing into a virtual environment, so don’t forget to activate it first.
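A quick way to confirm these prerequisites before installing is a short check script. The sketch below is only illustrative: the helper names are made up for this post, and it simply checks the Python version, whether a virtual environment appears to be active, and which backend package (torch or tensorflow) can be found.

```python
import importlib.util
import sys


def python_ok(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv():
    """Return True when running inside a venv (prefix differs from the base prefix)."""
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base


def available_backends(candidates=("torch", "tensorflow")):
    """Return the candidate packages that can be found, without importing them."""
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    print("Python 3.6+:", python_ok())
    print("Virtual environment active:", in_virtualenv())
    print("Backends found:", available_backends() or "none")
```

Note that the virtual environment check only covers environments created with venv or virtualenv; a conda environment, for example, would not be detected this way.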
The installation is quite easy. Once TensorFlow or PyTorch has been installed, you just need to type:
pip install transformers
In this post, we are going to use PyTorch. It should be easy to translate into TensorFlow, though: just add ‘TF’ at the beginning of each model class name.
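To make the ‘TF’ naming rule concrete, here is a minimal sketch. The helpers tf_class_name and load_classifier are hypothetical (they are not part of the library), while BertForSequenceClassification and TFBertForSequenceClassification are class names from 🤗 Transformers. The import is kept inside a function so nothing is downloaded until you call it, and it assumes a version of the library that ships TensorFlow classes along with the matching backend.

```python
def tf_class_name(pytorch_class_name):
    """Apply the naming rule: the TensorFlow class is the PyTorch name prefixed with 'TF'."""
    return "TF" + pytorch_class_name


def load_classifier(checkpoint="bert-base-uncased", use_tensorflow=False):
    """Load a sequence classification model in either framework.

    Assumes transformers plus the chosen backend are installed; the
    checkpoint is downloaded on first use.
    """
    import transformers  # imported lazily so the naming helper works without it

    name = "BertForSequenceClassification"
    if use_tensorflow:
        name = tf_class_name(name)
    model_class = getattr(transformers, name)
    return model_class.from_pretrained(checkpoint)


if __name__ == "__main__":
    print(tf_class_name("BertForSequenceClassification"))
    print(tf_class_name("AutoModelForSequenceClassification"))
```

Calling load_classifier(use_tensorflow=True) would then be the TensorFlow counterpart of the PyTorch default, with the rest of the calling code left unchanged.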
Pipeline
When you just want to test a model or simply use it to predict some sentences, you can use pipeline(). Besides text classification, it supports many other tasks such as text generation, question answering, summarization, and so on. To run a sentiment analysis task, simply type:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')
result = classifier('We are very happy to show you the 🤗 Transformers library.')
It uses a model named “distilbert-base-uncased-finetuned-sst-2-english” by default. We can also change to other models that we can find in the model hub. For example, if we want to use nlptown/bert-base-multilingual-uncased-sentiment, then simply do the following:
classifier = pipeline('sentiment-analysis', model="nlptown/bert-base-multilingual-uncased-sentiment")
Fine-tune
First things first, we need a dataset. Here we are going to use a dataset provided by 🤗 Datasets. The library offers datasets for a wide range of tasks, from text classification and token classification to language modeling and many more. To install it, simply execute the following line:
pip install datasets
Load data
We are going to use the sst2 dataset from the GLUE benchmark and the pretrained bert-base-uncased model. By running load_dataset and load_metric, we download the dataset as well as the metric. load_metric automatically loads the metric associated with the chosen task.
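As a rough sketch of this step, the snippet below loads the data and the metric. Several things here are assumptions rather than the post's own code: the function names are made up, and newer releases of 🤗 Datasets have moved load_metric into the separate evaluate library, so the import may need adjusting. The plain accuracy helper mirrors the kind of score the sst2 metric reports, so predictions can be sanity-checked without downloading anything.

```python
def load_sst2():
    """Download the SST-2 dataset and its GLUE metric.

    Assumes a datasets version that still provides load_metric;
    on newer versions, evaluate.load("glue", "sst2") is the likely replacement.
    """
    from datasets import load_dataset, load_metric  # lazy import: needs network on first call

    dataset = load_dataset("glue", "sst2")
    metric = load_metric("glue", "sst2")
    return dataset, metric


def accuracy(predictions, references):
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(int(p == r) for p, r in zip(predictions, references))
    return correct / len(references)


if __name__ == "__main__":
    # Offline sanity check with made-up labels (1 = positive, 0 = negative).
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 3 of 4 correct

    # Uncomment to download the real data (requires the datasets package):
    # dataset, metric = load_sst2()
    # print(dataset["train"][0])
    # print(metric.compute(predictions=[1, 0], references=[1, 1]))
```

Keeping the download inside load_sst2 means the file can be imported or tested offline, and the real data is only fetched when that function is called.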






