What Is Behind the Scenes of a Chatbot NLU 🤖?
A few posts ago we introduced the basic components of a chatbot system. Today we'll dive a bit deeper into the nuts and bolts.
To review, the sequence of actions behind the scenes looks like the following:
How does the NLU component work?
Once a user message is received, the NLU engine runs a sequence of steps to understand what's going on:
- Tokenization
- Featurization
- Entity tagging
- Intent classification
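To make the four steps concrete, here is a toy, self-contained sketch in Python. Every component below (the whitespace tokenizer, the bag-of-words featurizer, the lexicon-based entity tagger, and the keyword-overlap "classifier") is a deliberately simplified stand-in for the real components discussed next, not an actual NLU engine:

```python
# A toy NLU pipeline illustrating the four steps above.
# All components are simplified stand-ins, not production NLU.

def tokenize(text):
    """Step 1: split the message into tokens."""
    return text.lower().replace("?", "").split()

def featurize(tokens, vocab):
    """Step 2: turn tokens into a bag-of-words count vector."""
    return [tokens.count(word) for word in vocab]

def tag_entities(tokens, entity_lexicon):
    """Step 3: tag tokens that match a known entity lexicon."""
    return {tok: entity_lexicon[tok] for tok in tokens if tok in entity_lexicon}

def classify_intent(features, intent_keywords, vocab):
    """Step 4: score each intent by keyword overlap
    (a stand-in for a trained classifier)."""
    scores = {
        intent: sum(features[vocab.index(w)] for w in words if w in vocab)
        for intent, words in intent_keywords.items()
    }
    return max(scores, key=scores.get)

vocab = ["balance", "checking", "transfer", "money", "account"]
entity_lexicon = {"checking": "account_type"}
intent_keywords = {"check_balance": ["balance"], "transfer_funds": ["transfer", "money"]}

tokens = tokenize("What is the balance of my checking account?")
features = featurize(tokens, vocab)
entities = tag_entities(tokens, entity_lexicon)
intent = classify_intent(features, intent_keywords, vocab)
print(intent, entities)  # → check_balance {'checking': 'account_type'}
```

In a real system each of these functions is replaced by a trained component, as described below.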
Letâs review part of the dialogue from last time:
What are the options for each of these steps?
In some commercial systems, each of these components is highly customizable. For instance:

- For tokenization, you may want to use open-source NLP libraries like spaCy, which supports tokenization for many languages.
- For featurization, you can choose among various embeddings, including pre-trained models like BERT or your own domain-fine-tuned representations.
- For entity tagging, spaCy offers strong NER capabilities.
- For intent classification, you can plug in sklearn classifiers or transformers; Rasa ships a default joint classifier and tagger model called DIET (Dual Intent and Entity Transformer).
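For concreteness, in Rasa this kind of pipeline is declared in `config.yml`. The component names below come from Rasa's documented pipeline components, but treat this as an illustrative sketch and check the docs for your Rasa version before copying it:

```yaml
pipeline:
  - name: SpacyNLP              # loads a spaCy language model
    model: en_core_web_md
  - name: SpacyTokenizer        # tokenization
  - name: SpacyFeaturizer       # dense features from spaCy embeddings
  - name: SpacyEntityExtractor  # spaCy's pre-trained NER
  - name: CountVectorsFeaturizer
  - name: DIETClassifier        # joint intent classifier + entity tagger
    epochs: 100
```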
How does the dialogue manager work?
Dialogue management systems typically can be composed using a combination of the following:
- Rules
- Models
The dialogue manager dictates the bot's actions. The bot's designers usually have some expectation of the "happy paths", as well as the common ways things can go wrong. They can simply use IF-THEN logic to encode these rules into the "action plan", as in the following example:
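A rule-based policy of this kind can be sketched as a lookup from NLU output to the next action. The intent, entity, and action names here are made up for illustration:

```python
# A toy rule-based dialogue policy: IF the NLU output matches a condition,
# THEN return a fixed next action. All names are hypothetical.

RULES = {
    ("greet", None): "utter_greet",
    ("check_balance", "account_type"): "action_fetch_balance",
    ("check_balance", None): "utter_ask_account_type",
}

def rule_based_policy(intent, entities):
    """Pick the next bot action from hand-written rules."""
    entity_type = next(iter(entities.values()), None)
    return RULES.get((intent, entity_type), "action_default_fallback")

print(rule_based_policy("check_balance", {"checking": "account_type"}))
# → action_fetch_balance
print(rule_based_policy("check_balance", {}))
# → utter_ask_account_type
```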
Another reason rules are necessary is to constrain the bot for the sake of a bounded customer experience. Customers don't want surprises. Sometimes a customer's inquiry has absolutely no ambiguity, e.g., "What is the current balance of my checking account ending in 0325?" In that case, there's no need to take the risk of predicting the bot's next action with a probabilistic model.
Otherwise, a model-based approach can be used. Rasa offers a transformer-based sequence prediction model called the Transformer Embedding Dialogue (TED) policy. Using our example dialogue, we can see that each of the bot's responses carries its own action label. With rules and context, the dialogue manager can learn to predict the correct action at each response point. The arrows below illustrate where the dialogue manager needs to predict the next action.
In practice, most bots mix rule-based and ML-based dialogue management.
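One common way to combine the two is to try the rules first and fall back to a learned policy only when no rule matches. This is a minimal sketch of that idea, with a hard-coded stand-in for a trained model and made-up intent/action names:

```python
# Hybrid dialogue management: rules handle the unambiguous cases;
# a (stand-in) learned policy handles everything else.

RULES = {"check_balance": "action_fetch_balance"}  # unambiguous, no surprises

def model_predict(dialogue_history):
    """Stand-in for a learned policy such as Rasa's TED:
    scores candidate actions and returns the best one."""
    actions = ["utter_clarify", "action_search_faq", "utter_handoff_to_human"]
    scores = [0.2, 0.7, 0.1]  # would come from the model in a real system
    return actions[scores.index(max(scores))]

def next_action(intent, dialogue_history):
    if intent in RULES:                      # rule: bounded behavior
        return RULES[intent]
    return model_predict(dialogue_history)   # model: open-ended cases

print(next_action("check_balance", []))      # → action_fetch_balance
print(next_action("ask_mortgage_rates", []))  # → action_search_faq
```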
This reminds me of the challenge in robotics of leveraging natural language prompts and an NLU system to decide how to execute a task. The following is an example from Google Research.
The blue bar shows how useful the language model estimates a skill to be for the task at hand; the red bar shows how likely the system is to successfully execute that skill; and the green bar shows the combined score used to finally select a skill to execute.
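The skill selection just described can be sketched as multiplying the two scores and picking the argmax. The skills and numbers below are invented purely to illustrate the combination step:

```python
# Combining the two scores above: the language model's usefulness estimate
# (blue bar) times the execution-success estimate (red bar) yields the
# combined score (green bar). Skills and values are made up for illustration.

skills = ["find a sponge", "go to the table", "pick up the apple"]
usefulness = {"find a sponge": 0.6, "go to the table": 0.3, "pick up the apple": 0.1}
success = {"find a sponge": 0.9, "go to the table": 0.8, "pick up the apple": 0.2}

combined = {s: usefulness[s] * success[s] for s in skills}
chosen = max(combined, key=combined.get)
print(chosen)  # → find a sponge
```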
In other words, the following is how the robot might be thinking about this problem:
As you can see, where NLP and robotics cross paths is exactly "dialogue management" (if not more). That's where actions are determined. A chatbot simply responds with natural language or takes a digital action, whereas a robot might take a physical action to fulfill the user's request.
Lastly, the Google Research video is really short and worth a view:
https://www.youtube.com/watch?v=E2R1D8RzOlM&t=109s
Happy practicing!
Thanks for reading my newsletter. You can follow me on LinkedIn!
Sources of images and quotes: Stanford MLSys Seminars, "ML Frameworks for Chatbots feat. Chris Kedzie" (Stanford MLSys Seminar Episode 34); Rasa, https://rasa.com/