Summary

The web content describes the use of Simple Transformers, a library built on top of Hugging Face's Transformers, for implementing question answering (QA) systems using BERT, XLNet, XLM, and DistilBERT models, with training and evaluation using the SQuAD 2.0 dataset.

Abstract

The article discusses the application of Transformer models for question answering tasks, emphasizing the simplicity of using the Simple Transformers library. It outlines the process of setting up the environment, preparing the SQuAD 2.0 dataset, and training a QA model. The library leverages pre-trained models like BERT and XLNet, allowing for efficient fine-tuning on the QA task. The article also provides guidance on data preparation, model training, and evaluation, including how to format data and submit predictions to the SQuAD leaderboard for performance assessment. The results section presents the model's performance metrics, acknowledging the challenge of the SQuAD 2.0 benchmark and suggesting potential improvements through hyperparameter tuning and using larger models.

Opinions

The author endorses the Simple Transformers library for its ease of use in implementing question answering systems.
The author suggests that transfer learning with pre-trained Transformer models is essential for state-of-the-art performance in NLP tasks, including question answering.
The article implies that the SQuAD 2.0 dataset is a reputable benchmark for evaluating the performance of QA models.
The author recommends using larger models, such as the 'large' variant, to achieve better results on the SQuAD 2.0 dataset.

Question Answering with BERT, XLNet, XLM, and DistilBERT using Simple Transformers

Question: How to use Transformers for Question Answering? Answer: Simple Transformers, duh! (See what I did there?)

Question Answering in NLP

Context: Question answering (QA) is a computer science discipline within the fields of information retrieval and natural language processing (NLP), which is concerned with building systems that automatically answer questions posed by humans in a natural language.

Human: What is a Question Answering system? System: systems that automatically answer questions posed by humans in a natural language

QA has applications in a vast array of tasks including information retrieval, entity extraction, chatbots, and dialogue systems to name but a few. While question answering can be done in various ways, perhaps the most common flavour of QA is selecting the answer from a given context. In other words, the system will pick a span of text from the context that correctly answers the question. If a correct answer cannot be found from the context, the system will merely return an empty string.

Transfer learning with pre-trained Transformer models has become ubiquitous in NLP problems and question answering is no exception. With that in mind, we are going to use BERT to tackle the task of question answering!

We’ll be using the Simple Transformers library to easily work with Transformer models.

Repository: ThilinaRajapakse/simpletransformers (github.com)

Simple Transformers is built on top of the superb Hugging Face Transformers library.

Setup

Install Anaconda or Miniconda Package Manager from here.
Create a new virtual environment and install packages.
    conda create -n simpletransformers python pandas tqdm
    conda activate simpletransformers
    If using CUDA:
        conda install pytorch cudatoolkit=10.0 -c pytorch
    else:
        conda install pytorch cpuonly -c pytorch
    conda install -c anaconda scipy
    conda install -c anaconda scikit-learn
    pip install transformers
    pip install seqeval
    pip install tensorboardx
Install Apex if you are using fp16 training. Please follow the instructions here. (Installing Apex from pip has caused issues for several people.)
Install simpletransformers.
    pip install simpletransformers

The Dataset

We will be using the Stanford Question Answering Dataset (SQuAD 2.0) for training and evaluating our model. SQuAD is a reading comprehension dataset and a standard benchmark for QA models. The dataset is publicly available on the SQuAD website.

Download the dataset and place the files (train-v2.0.json, dev-v2.0.json) in the data/ directory.

Data Preparation

In order to perform QA in Simple Transformers, the data has to be in JSON files or in a Python list of dicts in the correct format.

If using JSON files, the files should contain a single list of dictionaries. A dictionary represents a single context and its associated questions.

Each such dictionary contains two attributes, the "context" and "qas".

context: The paragraph or text from which the question is asked.
qas: A list of questions and answers.

Questions and answers are represented as dictionaries. Each dictionary in qas has the following format.

id: (string) A unique ID for the question. Should be unique across the entire dataset.
question: (string) A question.
is_impossible: (bool) True if the question cannot be answered from the context, False otherwise.
answers: (list) The list of correct answers to the question.

A single answer is represented by a dictionary with the following attributes.

text: (string) The answer to the question. Must be a substring of the context. (This is the same key that the SQuAD JSON files use.)
answer_start: (int) Starting index of the answer in the context.
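
To make the format concrete, here is a minimal sketch of a one-context dataset as a Python list of dicts. The context, questions, and IDs are made up for illustration, and the sketch assumes the answer string is stored under the SQuAD key "text". The check_format helper is hypothetical (not part of Simple Transformers); it only verifies the two constraints stated above: unique IDs, and answers that really occur at answer_start.

```python
# Made-up example data in the format described above.
context = "Simple Transformers is built on top of the Hugging Face Transformers library."
answer_text = "the Hugging Face Transformers library"

train_data = [
    {
        "context": context,
        "qas": [
            {
                "id": "00001",
                "question": "What is Simple Transformers built on top of?",
                "is_impossible": False,
                "answers": [
                    {
                        "text": answer_text,  # assumed key, as in SQuAD JSON
                        "answer_start": context.index(answer_text),
                    }
                ],
            },
            {
                # An unanswerable question has no answers.
                "id": "00002",
                "question": "Who wrote the SQuAD paper?",
                "is_impossible": True,
                "answers": [],
            },
        ],
    }
]


def check_format(data):
    """Hypothetical helper: validate IDs and answer spans, return the question count."""
    seen_ids = set()
    for entry in data:
        ctx = entry["context"]
        for qa in entry["qas"]:
            if qa["id"] in seen_ids:
                raise ValueError(f"duplicate id: {qa['id']}")
            seen_ids.add(qa["id"])
            for ans in qa["answers"]:
                start = ans["answer_start"]
                # The answer must be the exact substring starting at answer_start.
                if ctx[start:start + len(ans["text"])] != ans["text"]:
                    raise ValueError(f"answer span mismatch in {qa['id']}")
    return len(seen_ids)
```

Running check_format(train_data) before training is a cheap way to catch off-by-one errors in answer_start.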

We can convert the SQuAD data into this format quite easily.
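
One way to do the conversion is sketched below. A SQuAD JSON file nests paragraphs inside articles under a top-level "data" key, and each paragraph already has "context" and "qas", so flattening the paragraphs is enough. The function names flatten_squad and load_squad are my own, and the sample dict is a tiny made-up stand-in for train-v2.0.json.

```python
import json


def flatten_squad(squad):
    """Return the list of paragraph dicts ("context" + "qas") from a SQuAD-style dict."""
    return [
        paragraph
        for article in squad["data"]
        for paragraph in article["paragraphs"]
    ]


def load_squad(path):
    """Read a SQuAD JSON file from disk and flatten it."""
    with open(path, "r", encoding="utf-8") as f:
        return flatten_squad(json.load(f))


# Tiny in-memory stand-in for train-v2.0.json (made-up content).
sample = {
    "version": "v2.0",
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {
                    "context": "BERT is a Transformer model.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What is BERT?",
                            "is_impossible": False,
                            "answers": [{"text": "a Transformer model", "answer_start": 8}],
                        }
                    ],
                },
                {"context": "SQuAD is a reading comprehension dataset.", "qas": []},
            ],
        }
    ],
}

train_data = flatten_squad(sample)
# For the real files: train_data = load_squad("data/train-v2.0.json")
```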

Question Answering Model

Simple Transformers has a class that can be used for each supported NLP task. An object of this class is used to perform training, evaluation (when ground truth is known), and prediction (when ground truth is unknown).

Here, we are creating a QuestionAnsweringModel object and setting the hyperparameters for fine tuning the model. The first parameter is the model_type and the second is the model_name.
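
A sketch of what this could look like is given below. The hyperparameter values and the 'bert-base-cased' model name are illustrative assumptions rather than the exact settings behind the results in this article. The build_model wrapper is hypothetical; it defers the import so the snippet can be loaded even where the library is not installed.

```python
# Illustrative hyperparameters (assumed values, not necessarily the ones used here).
train_args = {
    "learning_rate": 3e-5,
    "num_train_epochs": 2,
    "max_seq_length": 384,
    "doc_stride": 128,
    "overwrite_output_dir": True,
}


def build_model(model_type="bert", model_name="bert-base-cased", args=None):
    """Create a QuestionAnsweringModel from a model type and a model name or path."""
    # Imported here so that defining this function does not require the library.
    from simpletransformers.question_answering import QuestionAnsweringModel

    return QuestionAnsweringModel(model_type, model_name, args=args)


# model = build_model(args=train_args)  # fetches the pre-trained weights on first use
```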

The args parameter takes in an optional Python dictionary of hyper-parameter values and configuration options. I highly recommend checking out all the options here.

The default values are shown below.

To load a previously saved model instead of a default model, you can change the model_name to the path to a directory which contains a saved model.

model = QuestionAnsweringModel('bert', 'path_to_model/')

Training

Training the model is a one-liner! Just pass in train_data to the train_model function.

You can also change the hyperparameters by passing in a dict containing the relevant attributes to the train_model method. Note that these modifications will persist even after training is completed.

The train_model method will create a checkpoint (save) of the model at every nth step where n is self.args['save_steps']. Upon completion of training, the final model will be saved to self.args['output_dir'].
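
Putting the training steps together, a minimal sketch might look like this. The fine_tune wrapper is a hypothetical helper, and it assumes train_model accepts the hyperparameter dict through an args keyword, as described above.

```python
def fine_tune(model, train_data, overrides=None):
    """Fine-tune `model` on `train_data`, optionally overriding hyperparameters."""
    if overrides:
        # The overrides are merged into the model's args and remain in effect
        # after training finishes.
        model.train_model(train_data, args=overrides)
    else:
        model.train_model(train_data)
    return model


# model = fine_tune(model, train_data)
# model = fine_tune(model, train_data, overrides={"num_train_epochs": 1})
```

Checkpoints and the final model end up wherever save_steps and output_dir point, so it is worth setting those in the args dict before a long run.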

Evaluation

The correct answers for the dev data are not provided in the SQuAD dataset but we can upload our predictions to the SQuAD website for evaluation. Alternatively, you could split the train data into training and validation datasets and use the model.eval_model() method to validate the model locally.
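
If you prefer the local route, a hold-out split could be sketched as follows. The split_contexts helper, the 10% fraction, and the seed are my own assumptions. Splitting by context (rather than by question) keeps all questions about a paragraph on the same side of the split.

```python
import random


def split_contexts(data, eval_fraction=0.1, seed=42):
    """Shuffle a copy of `data` and split it into (train, eval) lists of context dicts."""
    if not 0.0 < eval_fraction < 1.0:
        raise ValueError("eval_fraction must be between 0 and 1")
    shuffled = list(data)
    # A seeded Random instance makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


# train_split, eval_split = split_contexts(train_data)
# model.train_model(train_split)
# outputs = model.eval_model(eval_split)  # return shape may differ between library versions
```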

For this guide, I’ll simply be uploading the predictions to SQuAD.

Breaking down this code, we are reading in the dev data, converting it into the correct format, getting the model predictions, and finally writing to a JSON file in the required submission format.
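
The last step, writing the submission file, can be sketched without the library. SQuAD expects a single JSON object that maps each question ID to the predicted answer string, with an empty string for "no answer". The sketch assumes each prediction is a dict with "id" and "answer" keys; since some library versions may return a ranked list of candidate answers instead of a single string, it keeps the first candidate in that case. The function names, the example predictions, and the results/submission.json path are hypothetical.

```python
import json
import os


def to_submission(predictions):
    """Map question id -> answer string, the shape of a SQuAD prediction file."""
    submission = {}
    for pred in predictions:
        answer = pred["answer"]
        # If a ranked list of candidates was returned, keep the top one.
        if isinstance(answer, (list, tuple)):
            answer = answer[0] if answer else ""
        submission[pred["id"]] = answer
    return submission


def write_submission(predictions, path="results/submission.json"):
    """Write the predictions to `path` as JSON, creating the directory if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(to_submission(predictions), f)


# Made-up predictions standing in for the model's output on the dev data.
example_preds = [
    {"id": "q1", "answer": "a Transformer model"},
    {"id": "q2", "answer": ""},  # unanswerable question: empty string
    {"id": "q3", "answer": ["Stanford", "Stanford University"]},
]
submission = to_submission(example_preds)
```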

Results

The results obtained with these hyperparameters are given below.

"exact": 67.24500968584182, 
"f1": 70.47401515405956, 
"total": 11873, 
"HasAns_exact": 64.1025641025641, 
"HasAns_f1": 70.56983500744732, 
"HasAns_total": 5928, 
"NoAns_exact": 70.3784693019344, 
"NoAns_f1": 70.3784693019344, 
"NoAns_total": 5945

SQuAD 2.0 is a challenging benchmark and this is reflected in these results. Some hyperparameter tuning should bump up these scores. Also, using a large model rather than a base model should significantly boost the results as well.
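
As a quick sanity check on how to read these numbers, the overall "exact" and "f1" scores are the question-count-weighted averages of the answerable (HasAns) and unanswerable (NoAns) subsets. The short sketch below recomputes them from the values reported above; the weighted helper is my own.

```python
# Values copied from the results above.
results = {
    "exact": 67.24500968584182,
    "f1": 70.47401515405956,
    "total": 11873,
    "HasAns_exact": 64.1025641025641,
    "HasAns_f1": 70.56983500744732,
    "HasAns_total": 5928,
    "NoAns_exact": 70.3784693019344,
    "NoAns_f1": 70.3784693019344,
    "NoAns_total": 5945,
}


def weighted(metric, r):
    """Combine the HasAns and NoAns scores for `metric`, weighted by question count."""
    return (
        r[f"HasAns_{metric}"] * r["HasAns_total"]
        + r[f"NoAns_{metric}"] * r["NoAns_total"]
    ) / r["total"]


for metric in ("exact", "f1"):
    # Prints the recomputed and the reported score side by side.
    print(metric, round(weighted(metric, results), 4), round(results[metric], 4))
```

Note that NoAns_exact and NoAns_f1 are identical: for an unanswerable question the prediction is either the empty string (full credit on both metrics) or it is not (no credit on either).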