The article discusses simple intent recognition and question answering using the DeepPavlov framework, which offers text classification solutions for limited training data scenarios, particularly for autoFAQ models.
Abstract
The article introduces the DeepPavlov framework as a solution for text classification problems commonly encountered in customer support scenarios, such as recognizing user intent in chatbots or classifying FAQ questions. It emphasizes the framework's ability to handle limited training data effectively, providing a cost-effective alternative to hiring an NLP engineer or using a black-box API service. DeepPavlov, which leverages TensorFlow and Keras, allows users to fine-tune hyperparameters and test various models. The article also demonstrates how to perform text classification with DeepPavlov using a student FAQ dataset, highlighting the ease of interaction with pretrained models through both Python code and the command line. Additionally, the article covers the model description, evaluation, and concludes with the effectiveness of DeepPavlov's models for small datasets, suggesting that more sophisticated deep learning models could be applied with larger datasets.
Opinions
The article suggests that DeepPavlov's text classification models are particularly useful when dealing with a limited pool of predefined questions, as is common in customer support.
It is implied that hiring an NLP engineer for custom solutions can be costly, and black-box API services may lack the desired flexibility for fine-tuning models.
The article conveys that DeepPavlov's open-source nature and pretrained models offer a balance between cost and flexibility, making it an attractive option for businesses and developers.
The use of leave-one-out cross-validation (LOOCV) for model evaluation indicates a preference for this method when dealing with small datasets.
The author expresses that the fastText mean-based model is superior for datasets with rich lexical variability, while the tf-idf logistic regression model is preferred among tf-idf-based models.
The article concludes with an endorsement of DeepPavlov's models for scenarios with limited data, while acknowledging the potential of more sophisticated models when larger datasets are available.
Simple intent recognition and question answering with DeepPavlov
Let’s get straight to the point. Suppose your clients are likely to ask questions from a very limited pool. You can address these questions either via a customer support call center or via a chat widget on the webpage. In both cases, you need to decide whether each user input is semantically similar to one of the predefined questions and, if so, return the corresponding answer. Formally speaking, this is a text classification problem. Text classification is one of the most widely used tasks in natural language processing (NLP).
Text classification can solve the following problems:
Recognize a user’s intent in any chatbot platform.
Distinguish between spam and nonspam messages.
Identify the sentiment of client reviews.
Classify a product item into one or more product types from a catalog.
If you are faced with one of these problems, there are several solutions at your disposal. You can hire an NLP engineer to collect the training data for you and then train, fine-tune, and retrain the model until its performance satisfies you. Alternatively, you can employ one of the NLP API services to do the job for you. Both solutions have advantages and disadvantages. An NLP engineer is costly, and a black-box API service might not give you the flexibility to test several models and fine-tune all the parameters you want.
Fortunately, the open-source conversational AI framework DeepPavlov offers a free and easy-to-use solution. DeepPavlov comes with a bunch of predefined components powered by TensorFlow and Keras for solving NLP-related problems, including text classification. The framework allows you to fine-tune hyperparameters and test several models.
Next, I would like to show how you can do text classification with the DeepPavlov framework. In particular, I will focus on the case when the training data is limited. A popular usage scenario for these models is to classify a user utterance into one of the FAQ questions and retrieve the corresponding answer (autoFAQ models). The code used in this article can be accessed on Colaboratory via the link.
Text classification with DeepPavlov
We use the student FAQ as a dataset for demonstration. The FAQ consists of student questions with corresponding answers. Here is a sample data file.
Question,Answer
How to get a bank card?,Visit the social service on the second floor of the building housing the dining hall. The social service is next to the local internal affairs office.
The DeepPavlov framework already contains pretrained models for classifying these questions. You can interact with a pretrained model either by running Python code or via the command line. The interaction with the model via the command line is illustrated by the snippet below. But first, install all the model requirements.
q::I need help
>> If you have any further inquiries, you can address them to the International Students Office, which is located in the Auditorium Building, Room 315. The phone number is (7-495) 408-7043.
Here, tfidf_logreg_en_faq is the model’s configuration file, and -d indicates that all model-related data is to be downloaded.
Alternatively, you can interact with the model by running the following Python code. In addition, you can navigate the configuration files by using Autocomplete (Tab key) on configs.
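A minimal sketch of such a call, assuming the `build_model` helper and the `configs` object exported by the `deeppavlov` package, and assuming the model requirements are already installed. The wrapper name `load_faq_model` is hypothetical and used only for illustration; the exact shape of the returned prediction depends on the configuration and is not shown here.

```python
def load_faq_model(download=True):
    """Build the pretrained FAQ model referenced in this article.

    The import lives inside the function so that this sketch can be loaded
    without DeepPavlov installed; calling it does require the package.
    """
    # Assumption: build_model and configs are importable from deeppavlov.
    from deeppavlov import build_model, configs

    # configs.faq.tfidf_logreg_en_faq points at the configuration file;
    # download=True asks the framework to fetch all model-related data.
    return build_model(configs.faq.tfidf_logreg_en_faq, download=download)


# Usage (needs DeepPavlov and the model requirements):
# faq = load_faq_model()
# print(faq(["I need help"]))
```

The model takes a batch (a list) of utterances, so a single question is wrapped in a list.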
Model description
DeepPavlov contains several text classification models that work well with only a few training pairs. The models are described in separate configuration files under the config/faq folder.
The config file consists of four main sections: dataset_reader, dataset_iterator, chainer, and train. The dataset_reader defines the dataset’s location along with the dataset format (x_col_name, y_col_name). After loading, the data is split into the train, validation, and test sets according to the dataset_iterator settings.
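The layout described above can be pictured as the following skeleton, written as a Python dictionary that mirrors the JSON structure. This is only an illustrative outline, not a configuration shipped with the framework: the section names come from the text, the column names are taken from the header of the sample data file, and every value left empty is a placeholder.

```python
# Illustrative outline of a configuration file (not a real shipped config).
config_skeleton = {
    "dataset_reader": {
        # Column names taken from the sample CSV header in this article.
        "x_col_name": "Question",
        "y_col_name": "Answer",
        # The dataset location is also defined in this section; its key
        # name is not given in the article, so it is omitted here.
    },
    # Settings for splitting the data into train, validation, and test sets.
    "dataset_iterator": {},
    "chainer": {
        "in": [],    # inputs to the chainer (placeholder)
        "out": [],   # outputs of the chainer (placeholder)
        "pipe": [],  # ordered list of components (placeholder)
    },
    # Training settings such as the measured metrics (placeholder).
    "train": {},
}

section_names = list(config_skeleton)
```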
The chainer section of the configuration files consists of three subsections. The in and out sections define the input and output of the chainer, whereas the pipe section defines a pipeline of the components required to interact with the models, i.e., the tokenizer, the tf-idf vectorizer, and others. The tokenizer splits a string into tokens. The tf-idf vectorizer transforms the tokens into tf-idf vectors. In the configuration below, the tokenizer with the lemmatizer enabled (lemmas: true) divides an input question into tokens, converts the tokens into lemmas, and stores the output in q_token_lemmas. The next component, fasttext, loads fastText embeddings (from the load_path file) and converts all the q_token_lemmas lemmas into word vectors. As its output, it calculates the mean of all word vectors and stores it in out. Finally, the cos_sim_classifier component, fitted on the training data listed in fit_on, returns the top_n candidates for our input q_vect using the cosine similarity measure.
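The pipeline just described (tokenize, map tokens to word vectors, average them, rank training questions by cosine similarity) can be sketched in plain Python. This is not DeepPavlov code: the three-dimensional `TOY_VECTORS` are made-up stand-ins for real fastText embeddings, lemmatization is skipped, and all function names are hypothetical.

```python
import math

# Made-up 3-dimensional vectors standing in for fastText embeddings.
TOY_VECTORS = {
    "bank": [0.9, 0.1, 0.0],
    "card": [0.8, 0.2, 0.1],
    "get": [0.3, 0.3, 0.3],
    "dormitory": [0.0, 0.9, 0.2],
    "room": [0.1, 0.8, 0.3],
    "find": [0.3, 0.3, 0.3],
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip("?.,!") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def mean_vector(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine(u, v):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n_candidates(query, train_questions, n=1):
    """Return the n training questions most similar to the query."""
    q_vect = mean_vector(tokenize(query))
    scored = [(cosine(q_vect, mean_vector(tokenize(q))), q) for q in train_questions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [question for _, question in scored[:n]]


train_questions = ["How to get a bank card?", "How to find a dormitory room?"]
best = top_n_candidates("bank card", train_questions, n=1)
```

Because the query is compared against whole training questions, adding paraphrases of each FAQ question to the training set gives the classifier more targets to match.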
You can train a model by running it with the train parameter; the model will then be trained on the dataset defined in the dataset_reader section of the configuration file. The DeepPavlov framework allows you to test all the available models on your data in order to identify the best-performing one. To test a model, specify the dataset split along with the split fields in the dataset_iterator section of the configuration file. In addition, you should define the metrics to be measured in the train section.
Alternatively, you can train the model by running the following Python code.
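A minimal sketch of that call, assuming the `train_model` helper and the `configs` object exported by the `deeppavlov` package. The wrapper name `train_faq_model` is hypothetical, and whether extra data must be downloaded first depends on the configuration.

```python
def train_faq_model():
    """Train the FAQ model on the dataset named in its configuration file.

    The import is kept inside the function so that this sketch loads without
    DeepPavlov installed; calling it does require the package.
    """
    # Assumption: train_model and configs are importable from deeppavlov.
    from deeppavlov import configs, train_model

    # The dataset comes from the dataset_reader section of the config.
    return train_model(configs.faq.tfidf_logreg_en_faq)


# Usage (needs DeepPavlov and the model requirements):
# model = train_faq_model()
```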
Model evaluation
Model performance was measured on the FAQ dataset (with manually added paraphrases for each question). Because the number of question-answer pairs is limited, performance was measured using leave-one-out cross-validation (LOOCV).
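For readers unfamiliar with LOOCV, the idea is to hold out each sample in turn, fit on the rest, and check the prediction for the held-out sample. The sketch below shows the procedure in plain Python with a deliberately simple word-overlap classifier. The data and the function names are hypothetical, and this is not the evaluation code used for the results reported here.

```python
def leave_one_out_accuracy(samples, predict):
    """Hold out each (text, label) pair once and score the prediction.

    predict(train, text) must return a label given the remaining samples.
    """
    correct = 0
    for i, (text, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # everything except sample i
        if predict(train, text) == label:
            correct += 1
    return correct / len(samples)


def overlap_predict(train, text):
    """Toy classifier: label of the training text sharing the most words."""
    words = set(text.lower().split())
    best = max(train, key=lambda pair: len(words & set(pair[0].lower().split())))
    return best[1]


# Hypothetical paraphrase data: two intents with two phrasings each.
samples = [
    ("how to get a bank card", "bank"),
    ("where can i get a bank card", "bank"),
    ("how to find a dormitory room", "dorm"),
    ("how to find my dormitory room", "dorm"),
]
accuracy = leave_one_out_accuracy(samples, overlap_predict)
```

Note that LOOCV only makes sense here because each question has paraphrases: with a single phrasing per label, the held-out label would never appear in the training fold.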
All the models are based on two major text representations: fastText word embeddings and tf-idf representation.
The fastText model (fasttext_avg_autofaq.json) is a popular approach that averages fastText word embeddings and assigns the label of the closest utterance from the training set according to cosine distance. The tf-idf model (tfidf_autofaq.json) uses the tf-idf representation of the utterances; then, like the previous model, it uses cosine distance to assign a label. A hybrid fastText tf-idf weighting model (fasttext_tfidf_autofaq.json) weights the fastText word embeddings by tf-idf values and also uses the cosine similarity approach. Finally, the tf-idf logistic regression model (tfidf_logreg_autofaq.json) trains a logistic regression on the tf-idf representation of the input.
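To make the tf-idf variant concrete, here is a small plain-Python sketch of nearest-question matching with tf-idf vectors and cosine similarity. It is not the DeepPavlov implementation: the smoothed idf formula is one common choice, the second FAQ entry is invented for the example, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip("?.,!") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def fit_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse tf-idf vector as a dict; out-of-vocabulary terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


faq = [
    ("How to get a bank card?", "Visit the social service on the second floor."),
    # Invented entry, present only so there is more than one class.
    ("Where is the library?", "The library is in the main building."),
]
tokenized = [tokenize(question) for question, _ in faq]
idf = fit_idf(tokenized)
vectors = [tfidf_vector(tokens, idf) for tokens in tokenized]


def answer(query):
    """Return the answer attached to the most similar FAQ question."""
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [cosine(q_vec, vec) for vec in vectors]
    best = max(range(len(faq)), key=scores.__getitem__)
    return faq[best][1]


reply = answer("bank card")
```

The hybrid model described above would reuse the same idf weights, but apply them to word embeddings before averaging instead of to raw term counts.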
The results, in sorted order, are presented in Table 1.
The fastText mean-based model outperforms all tf-idf-based models by a large margin, which may be explained by the rich lexical variability of the dataset. The tf-idf logistic regression, which learns to assign weights to the words, outperforms the other tf-idf-based models.
Conclusion
In this article, I described the text classification models of the DeepPavlov framework. The relevant code can be found in the Colab notebook. These models were specifically developed to be effective for a small training dataset. However, if a large enough dataset is available, more sophisticated deep learning models can be applied.