Summary

The website content provides an overview of the internal workings of chatbots, detailing the two main components: Natural Language Understanding (NLU) and Dialogue Management.

Abstract

The article "What Does A Chatbot 🤖 Look Like Under the Hood?" delves into the architecture of chatbots, emphasizing the importance of Natural Language Understanding (NLU) and Dialogue Management. NLU encompasses Intent Classification and Entity Extraction models, which are crucial for interpreting user inputs. Intent Classification categorizes user intents, while Entity Extraction identifies specific details within the user's message, such as spending categories and time frames. The article also touches on the complexities of entity linking, which disambiguates entities by connecting them to a knowledge base. Dialogue Management is responsible for orchestrating the bot's responses, whether through predefined choices or dynamic templates that interact with backend systems. The article promises future discussions on best practices, including fallback policies for when bots fail to understand user queries.

Opinions

The author suggests that chatbots require careful planning and design to handle expected user intents and to provide meaningful responses.
There is an acknowledgment of the active research in Named-Entity Recognition (NER) and its importance in chatbot interactions.
The article implies that entity linking is a non-trivial task, as evidenced by the author's personal experience with an interview question on the topic.
The author expresses that dialogue management is not just about generating responses but also about managing conversation flow and user experience, particularly in scenarios where the bot does not understand the user's input.
The author indicates a commitment to further exploration of chatbot technology, including more detailed discussions on entity linking, dialogue management policies, and fallback strategies in future posts.
The use of therapy bots like Woebot and Wysa is highlighted as an example where constrained response options are used to guide conversations effectively.

What Does A Chatbot 🤖 Look Like Under the Hood?

You have probably interacted with a chatbot here and there already. A quick poll:

POLL

Given the choice of a chatbot, would you try it out first, or are you committed to calling an agent?

1. Talk with the chatbot first! (type “1” in comment)

2. Absolutely call the human! (type “2” in comment)

So believers and non-believers, what does a chatbot look like under the hood?

At a high level, there are two main components in a chatbot: the first piece is called NLU or natural language understanding, and the second piece is called Dialogue Management.

Let’s look at a simple dialogue:

This simple dialogue has three user turns where the user first clicked on the “account balance” button, then asked a question, and finally made a comment at the end.

Several machine learning (NLU) models were at work in this conversation. Let’s take a closer look:

Model #1 Classification Model

The first model involved is called Intent Classification. Its purpose is to categorize the user’s intent based on their responses. For those of you who use classifier models daily, does that sound familiar?

You can think of these intents as labels for things the user wants to do. The designers of the bot have to think of them ahead of time: which intents they expect to handle, how those intents get expressed in natural language, and how the bot can actually do something in response to each one.

As you can see in the above example, the “account_balance” class is the easy one: we obtained the label trivially because the user clicked a button. The ML classifier is used to determine the intent of the question “how much did I spend on groceries last month?”. The intent was to issue a “spending_query” against the database.
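To make the idea concrete, here is a minimal sketch of intent classification. It is a toy word-overlap scorer standing in for a trained ML classifier, not how a production bot would do it. The example utterances and the function names (`tokenize`, `build_vocabularies`, `classify_intent`) are assumptions made up for illustration; only the two intent labels come from the dialogue above.

```python
# Hypothetical example utterances per intent (invented for this sketch).
TRAINING_UTTERANCES = {
    "spending_query": [
        "how much did I spend on groceries last month",
        "what did I spend on dining this week",
    ],
    "account_balance": [
        "what is my account balance",
        "show me my balance",
    ],
}


def tokenize(text):
    """Lowercase and split on whitespace, dropping surrounding punctuation."""
    tokens = (word.strip("?.!,").lower() for word in text.split())
    return [t for t in tokens if t]


def build_vocabularies(training):
    """Collect the set of words seen for each intent."""
    return {
        intent: {tok for utterance in utterances for tok in tokenize(utterance)}
        for intent, utterances in training.items()
    }


def classify_intent(utterance, vocabularies):
    """Return the intent sharing the most words with the utterance, or None."""
    tokens = set(tokenize(utterance))
    scores = {intent: len(tokens & vocab) for intent, vocab in vocabularies.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


VOCABULARIES = build_vocabularies(TRAINING_UTTERANCES)
print(classify_intent("How much did I spend on groceries last month?", VOCABULARIES))
# prints: spending_query
```

Returning `None` when nothing overlaps gives the rest of the bot a clear signal that the utterance was not understood, instead of forcing a guess.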

Model #2 Entity Extraction Model

However, just knowing that the user issued a “spending_query” is not enough to give the correct response back. The user wanted to know specifically about spending on “groceries” during the time span of “last month”. These pieces of information are called “entities”. In NLP, named-entity recognition (NER) is an active research area by itself. We’ll cover it in a later post!

An NER tagging model is needed to extract important entities and concepts from the utterances. In this example, we would like to distinguish the spending category of “groceries” from others, and narrow down the time range to “last month”. With this information, the bot can then query the customer database and respond to the user with the right answer.
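Here is a rough sketch of what the extraction step produces. A real bot would use a trained NER tagging model; this stand-in simply looks up phrases from small hand-written lists, which is an assumption made to keep the example self-contained. The lists `CATEGORIES` and `TIME_PHRASES` and the function `extract_entities` are hypothetical names.

```python
import re

# Hypothetical phrase lists; a trained NER model would replace these lookups.
CATEGORIES = ["groceries", "dining", "travel"]
TIME_PHRASES = ["last month", "this month", "last week"]


def extract_entities(utterance):
    """Return a list of entities found in the utterance, with character spans."""
    text = utterance.lower()
    entities = []
    for label, phrases in (("category", CATEGORIES), ("time_range", TIME_PHRASES)):
        for phrase in phrases:
            # Word boundaries keep us from matching inside longer words.
            match = re.search(r"\b" + re.escape(phrase) + r"\b", text)
            if match:
                entities.append({
                    "entity": label,
                    "value": phrase,
                    "start": match.start(),
                    "end": match.end(),
                })
    return entities


for entity in extract_entities("How much did I spend on groceries last month?"):
    print(entity["entity"], "->", entity["value"])
```

For the question in the dialogue, this yields a `category` entity with value `groceries` and a `time_range` entity with value `last month`, which is exactly what the bot needs to build its database query.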

There’s one other complexity here. To resolve the ambiguity of an entity like “last month”, the system needs to know today’s date: if today is a day in July, then “last month” means “June”. This is called “entity linking”. The purpose of entity linking is to disambiguate a named entity by connecting it to a knowledge base. I personally ran into an interview question about entity linking, and I will share it in another post!
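The “last month” case can be sketched with nothing more than the standard library. This only covers that one phrase, and the function name `resolve_last_month` and the example dates are assumptions for illustration; a real system would handle many more expressions.

```python
from datetime import date, timedelta


def resolve_last_month(today):
    """Return (first_day, last_day) of the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st always lands on the previous month's last day.
    last_day = first_of_this_month - timedelta(days=1)
    first_day = last_day.replace(day=1)
    return first_day, last_day


# Hypothetical "today" in July, matching the example in the text.
start, end = resolve_last_month(date(2021, 7, 15))
print(start, end)  # prints: 2021-06-01 2021-06-30
```

Working from the first day of the current month avoids special cases for month lengths and for January, where “last month” falls in the previous year.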

Dialogue management drives the bot’s responses in a holistic way. It tells the bot what to do (respond or take action) in a given situation. Sometimes responses are constrained to multiple choices, like the first turn in the above example. These are typically used to navigate the conversation paths when we know our customers’ top intents, or when we would like to exercise tighter control over where the dialogue can go. In industry, I’ve seen the second use case predominantly in therapy bots like Woebot and Wysa.
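One way to picture “what to do in a given situation” is a small function that maps the NLU output to the bot’s next action. This is only a hand-written sketch under my own assumptions: the action names, the button list (apart from “account balance”) and the function `next_action` are all hypothetical, and real dialogue managers are often far richer than a few `if` statements.

```python
# Hypothetical button options; only "account balance" appears in the dialogue.
TOP_INTENT_BUTTONS = ["account balance", "recent transactions"]

REQUIRED_SLOTS = ("category", "time_range")


def next_action(intent, entities):
    """Decide the bot's next move from the detected intent and entities."""
    if intent is None:
        # Nothing detected yet: constrain the user to the top intents.
        return {"action": "show_buttons", "options": TOP_INTENT_BUTTONS}
    if intent == "account_balance":
        return {"action": "query_backend", "query": "account_balance"}
    if intent == "spending_query":
        missing = [slot for slot in REQUIRED_SLOTS if slot not in entities]
        if missing:
            # Ask for whatever the entity extractor could not find.
            return {"action": "ask_user", "slots": missing}
        return {"action": "query_backend", "query": "spending", "params": entities}
    return {"action": "ask_rephrase"}


print(next_action("spending_query", {"category": "groceries"}))
# prints: {'action': 'ask_user', 'slots': ['time_range']}
```

Note how the decision depends on both models from earlier: the intent picks the branch, and the entities decide whether the bot can answer now or has to ask a follow-up question.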

Sometimes the dialogue manager responds with a template and requests data from the back end to fill it in; for instance, PII (personally identifiable information) like account numbers is not really shown during the conversation. Sometimes the dialogue manager applies various policies to handle different scenarios. For example, if a user types a question that the bot does not understand, the bot would usually respond with something like “Can you rephrase what you just asked?” or “Do you mean this or that?”. However, if the user has tried a few times and the bot still can’t understand, we need to design an end to the loop so that the customer is not stuck there forever. This is an example of what we call a fallback policy. There are best practices for these fallback policies, and we will cover them in a later post!
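Both ideas can be sketched in a few lines. Everything here is an assumption for illustration: the fake backend and its amount, the template wording, the threshold of two failed turns, and the hand-off to a human agent as the way out of the loop. Treat it as one possible shape, not a recommended policy.

```python
MAX_FAILED_TURNS = 2  # assumed threshold before the bot gives up


def fetch_spending(category, period):
    """Hypothetical backend lookup; the amount below is made up."""
    fake_db = {("groceries", "June"): 412.50}
    return fake_db[(category, period)]


def render_spending_response(category, period):
    """Fill a response template with data requested from the back end."""
    template = "You spent ${amount:.2f} on {category} in {period}."
    amount = fetch_spending(category, period)
    return template.format(amount=amount, category=category, period=period)


class FallbackPolicy:
    """Ask the user to rephrase a few times, then end the loop."""

    def __init__(self, max_failed_turns=MAX_FAILED_TURNS):
        self.max_failed_turns = max_failed_turns
        self.failed_turns = 0

    def next_response(self, understood):
        if understood:
            self.failed_turns = 0
            return None  # normal dialogue management takes over
        self.failed_turns += 1
        if self.failed_turns > self.max_failed_turns:
            self.failed_turns = 0
            # Assumed exit: hand the conversation to a human agent.
            return "Sorry, I still don't understand. Let me connect you with an agent."
        return "Can you rephrase what you just asked?"


print(render_spending_response("groceries", "June"))
# prints: You spent $412.50 on groceries in June.
```

With the assumed threshold, the user gets two rephrase prompts in a row, and the third consecutive miss triggers the exit. Any understood turn resets the counter.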

To conclude…

Today we talked about the high-level setup of a chatbot. We’ll write a series around the topic of chatbots.

Happy practicing!

Source of images and quotes: Stanford MLSys Seminars Episode 34, ML Frameworks for Chatbots feat. Chris Kedzie from Rasa

Thanks for reading my newsletter. You can also find the original post here, and follow me on LinkedIn!