What Does A Chatbot 🤖 Look Like Under the Hood?
You have probably interacted with a chatbot here and there already. Here is a quick poll:
POLL
Given the choice of a chatbot, would you try it out first, or are you committed to calling an agent?
1. Talk with the chatbot first! (type “1” in comment)
2. Absolutely call the human! (type “2” in comment)
So believers and non-believers, what does a chatbot look like under the hood?
At a high level, there are two main components in a chatbot: the first piece is called NLU or natural language understanding, and the second piece is called Dialogue Management.
Let’s look at a simple dialogue:

This simple dialogue has three user turns where the user first clicked on the “account balance” button, then asked a question, and finally made a comment at the end.
Several machine learning (NLU) models were at work in this conversation. Let’s take a closer look:
Model #1 Classification Model
The first model involved is called Intent Classification. Its purpose is to categorize the user’s intent based on what they say. For those of you who use classifier models daily, does that sound familiar?

You can think of these intents as labels for the things the user wants to do. The designers of the bot have to think of these ahead of time: which intents they expect to handle, how those intents get expressed in natural language, and how the bot can actually do something in response to each intent.
As you can see in the above example, the “account_balance” class is the easy one: the user clicked a button, so we obtained the label trivially. The ML classifier is needed to determine the intent of the question “how much did I spend on groceries last month?”. Here the intent is “spending_query”, that is, to issue a spending query against the database.
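To make the idea concrete, here is a minimal sketch of an intent classifier, assuming a tiny hand-written training set and a Naive Bayes model built with the standard library only. The training utterances and the “thank_you” intent name are made up for illustration. A production bot would likely use a much larger dataset and a stronger model.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training utterances; a real bot would need many more per intent.
TRAINING_DATA = [
    ("how much did I spend on groceries last month", "spending_query"),
    ("what did I spend at restaurants in May", "spending_query"),
    ("show my spending on gas this year", "spending_query"),
    ("what is my account balance", "account_balance"),
    ("how much money do I have in checking", "account_balance"),
    ("show me my current balance", "account_balance"),
    ("thanks that was helpful", "thank_you"),
    ("great thank you", "thank_you"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesIntentClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        self.word_counts = defaultdict(Counter)  # intent -> token counts
        self.intent_counts = Counter()           # intent -> number of utterances
        self.vocab = set()
        for text, intent in examples:
            tokens = tokenize(text)
            self.word_counts[intent].update(tokens)
            self.intent_counts[intent] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        """Return the intent with the highest log-probability for the text."""
        tokens = tokenize(text)
        total = sum(self.intent_counts.values())
        best_intent, best_score = None, -math.inf
        for intent, n in self.intent_counts.items():
            score = math.log(n / total)  # log prior of the intent
            denom = sum(self.word_counts[intent].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[intent][tok] + 1) / denom)
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent


classifier = NaiveBayesIntentClassifier().fit(TRAINING_DATA)
print(classifier.predict("how much did I spend on groceries last month?"))
```

Note that the button click in the first turn never goes through this model: the button already carries its intent label.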
Model #2 Entity Extraction Model
However, just knowing that the user issued a “spending_query” is not enough to give the correct response back. The user wanted to know specifically about spending on “groceries”, and during the time span of “last month”. These pieces of information are called “entities”. In NLP, named-entity recognition (NER) is an active research area in its own right. We’ll cover it in a later post!

An NER tagging model is needed to extract important entities and concepts from the utterances. In this example, we would like to distinguish the spending category of “groceries” from others, and narrow down the time range to “last month”. With this information, the bot can then query the customer database and respond to the user with the right answer.
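As a rough illustration of what the extraction step produces, here is a lookup-based sketch. It is not a learned NER tagger: it simply matches phrases from two small lexicons, and both lexicons and the entity type names (“category”, “time_range”) are assumptions made for this example.

```python
import re

# Hypothetical lexicons, invented for illustration only.
SPENDING_CATEGORIES = ["groceries", "restaurants", "gas", "travel"]
TIME_EXPRESSIONS = ["last month", "this month", "last week", "this year"]


def extract_entities(utterance):
    """Return (entity_type, text, start, end) tuples found by lexicon lookup."""
    entities = []
    lowered = utterance.lower()
    for entity_type, phrases in (("category", SPENDING_CATEGORIES),
                                 ("time_range", TIME_EXPRESSIONS)):
        for phrase in phrases:
            # \b keeps "gas" from matching inside a longer word like "gasoline".
            pattern = r"\b" + re.escape(phrase) + r"\b"
            for match in re.finditer(pattern, lowered):
                entities.append(
                    (entity_type, match.group(), match.start(), match.end())
                )
    # Sort by start offset so entities come back in reading order.
    return sorted(entities, key=lambda e: e[2])


entities = extract_entities("How much did I spend on groceries last month?")
print(entities)
```

A trained tagger would generalize to phrases that are not in any list, which is exactly what this lookup cannot do.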
There’s one other complexity here. To resolve an ambiguous entity like “last month”, the system needs to know that if today is a day in July, then “last month” means “June”. This is called “entity linking”. The purpose of entity linking is to disambiguate a named entity by mapping it to an entry in a knowledge base. I personally ran into an interview question about entity linking, and I will share it in another post!
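For the “last month” case specifically, the resolution step can be sketched with plain date arithmetic. This is only a sketch of that one expression, assuming the bot knows today’s date. The function name and the year 2023 are arbitrary choices for the example, and no knowledge base is involved here.

```python
from datetime import date, timedelta


def resolve_last_month(today):
    """Return (first_day, last_day) of the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st lands on the last day of the prior month.
    last_day = first_of_this_month - timedelta(days=1)
    first_day = last_day.replace(day=1)
    return first_day, last_day


# If today is a day in July, "last month" resolves to June.
start, end = resolve_last_month(date(2023, 7, 15))
print(start, end)  # 2023-06-01 2023-06-30
```

The resolved date range is what the bot can actually pass to the database query.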

Dialogue management drives the bot’s responses in a holistic way. It tells the bot what to do (respond or take action) in a given situation. Sometimes responses are constrained to multiple choices, like the first turn in the above example. These are typically used to navigate the conversation paths when we know our customers’ top intents, or when we would like to exercise tighter control over where the dialogue can go. For the second use case, I’ve seen these used predominantly in therapy bots like Woebot and Wysa.
Sometimes the dialogue manager responds with a template and requests data from the back end to fill it in. For instance, PII (personally identifiable information) like account numbers is typically not shown during the conversation. Sometimes the dialogue manager creates various policies to handle different scenarios. For example, if a user types a question that the bot does not understand, the bot would usually respond with something like “Can you rephrase what you just asked?” or “Do you mean this or that?”. However, if the user has tried a few times and the bot still can’t understand, we need to design an end to the loop so that the customer is not stuck there forever. This is an example of what we call a fallback policy. There are best practices for these fallback policies, and we will cover them in a later post!
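Here is a minimal sketch of a dialogue manager that combines a response template with a fallback policy. Everything named here is an assumption for illustration: the template text, the limit of two retries, the handoff message, and the fake back-end function. Handing off to a human agent is just one possible way to end the loop.

```python
MAX_FALLBACKS = 2  # assumed retry limit before ending the loop

# Hypothetical response templates keyed by intent.
TEMPLATES = {
    "spending_query": "You spent ${amount:.2f} on {category} {time_range}.",
}


class DialogueManager:
    def __init__(self, fetch_spending):
        self.fetch_spending = fetch_spending  # back-end lookup, injected by caller
        self.fallback_count = 0

    def respond(self, intent, entities):
        """Pick the next bot response from the NLU output for one user turn."""
        if intent in TEMPLATES:
            self.fallback_count = 0  # understood, so reset the retry counter
            amount = self.fetch_spending(
                entities["category"], entities["time_range"]
            )
            return TEMPLATES[intent].format(amount=amount, **entities)
        # Fallback policy: ask to rephrase a few times, then end the loop.
        self.fallback_count += 1
        if self.fallback_count > MAX_FALLBACKS:
            return "Sorry I could not help. Let me connect you with an agent."
        return "Can you rephrase what you just asked?"


def fake_fetch_spending(category, time_range):
    """Stand-in for a database query; returns a made-up amount."""
    return 412.5


dm = DialogueManager(fake_fetch_spending)
print(dm.respond("spending_query",
                 {"category": "groceries", "time_range": "last month"}))
print(dm.respond(None, {}))  # first miss: ask to rephrase
print(dm.respond(None, {}))  # second miss: ask to rephrase
print(dm.respond(None, {}))  # third miss: end the loop
```

The retry counter resets whenever the bot understands a turn, so only consecutive misses trigger the exit.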
To conclude…
Today we talked about the high-level setup of a chatbot. We’ll write a series around the topic of chatbots.
Happy practicing!

Source of images and quotes: Stanford MLSys Seminars Episode 34, ML Frameworks for Chatbots feat. Chris Kedzie from Rasa
Thanks for reading my newsletter. You can also find the original post here, and follow me on LinkedIn!






