How to Integrate Alan’s Speech-to-Text Engine into Doctor.ai
Improve Doctor.ai’s speech recognition with the highly accurate Alan AI
By Sixing Huang and Liang Li

A voice chatbot’s ability to capture the speaker’s utterances accurately can make or break the user experience. An accurate transcriber is a pleasure to talk to, and it boosts productivity because speaking is about two to three times faster than typing.
In contrast, a choppy speech-to-text engine, such as the one in Chrome, can quickly frustrate users. They soon realize that correcting the transcript costs more time than dictation saves. As a result, they type instead of speaking, and that defeats the whole purpose of a voice chatbot.
During the development of our medical chatbot Doctor.ai (1, 2, 3, 4, 5, and 6), we noticed early on that the speech-to-text engine in Chrome is not great. It dropped and misinterpreted words, and it formed incoherent sentences. Its performance in medical conversations, which Doctor.ai needs the most, was abysmal.
So we have been on the hunt for a better engine. The new engine should not only excel in everyday conversations but also understand common medical jargon, such as the names of diseases, drugs, and pathogens. Our search has been rewarded with Alan AI.



According to its website, Alan is a conversational voice AI platform. Its Spoken Language Understanding (SLU) is designed to process the error-prone output of Automatic Speech Recognition (ASR). It also has a so-called Domain Language Model that better recognizes specialized language and dynamically adapts to users’ conversational style.
We are impressed by Alan’s highly accurate voice capture (Figure 1). It clearly outperforms Chrome in both normal and technical conversations. It sailed through many biomedical terms such as “photosynthesis”, “frontal sinus”, “Doxepin” and “cowpox” (Figure 2). What amazes us the most is that, as you can see in Figure 1, even though Alan got some words wrong here and there, it was able to correct the mistakes and form coherent sentences in the end.

In this article, we are going to show you how to integrate the Alan button into Doctor.ai’s frontend to improve the user experience (Figure 3). Alan understands English. If you have the Enterprise version of Alan, you can make a German version, too. The project costs nothing under Alan’s Developer Plan. The code for this project is hosted in the GitHub repository here.
1. Get Alan’s SDK key
Go to Alan’s website and sign up, then get the “Developer Plan” by following the instructions. This plan gives you over ten thousand free interactions in Alan. Once you are in Alan Studio, click Create Voice Assistant and give it a name such as doctorai_en.


Once inside the Studio, create an “Alan Integrations” project. Delete all the original scripts in the left panel and create a new one. Copy and paste Code 1 into it. This script captures the user’s speech and silences Alan’s voice responses. Finally, click the </> Integrations button.
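To make the role of the script more concrete, here is a minimal sketch of what the receiving side in the frontend could look like. It is not Code 1 and it is not Doctor.ai’s actual code. It assumes, purely for illustration, that the script forwards each utterance to the page as a payload with a `command` field set to `"userInput"` and a `text` field holding the transcript. The payload shape and the names `TranscriptCommand`, `extractTranscript`, and `makeCommandHandler` are all hypothetical, and how the handler is registered with Alan’s web SDK is left out.

```typescript
// Hypothetical payload shape: the field names "command" and "text" are
// assumptions made for this sketch, not taken from Alan's documentation.
interface TranscriptCommand {
  command: string;
  text?: string;
}

// Returns the cleaned transcript if the payload carries a user utterance,
// otherwise null (unknown command, missing text, or only whitespace).
function extractTranscript(data: TranscriptCommand): string | null {
  if (data.command !== "userInput" || typeof data.text !== "string") {
    return null;
  }
  // Trim the ends and collapse runs of whitespace into single spaces.
  const cleaned = data.text.trim().replace(/\s+/g, " ");
  return cleaned.length > 0 ? cleaned : null;
}

// Builds a handler that forwards valid transcripts to the chatbot.
// The handler reports whether it forwarded anything.
function makeCommandHandler(sendToChatbot: (message: string) => void) {
  return (data: TranscriptCommand): boolean => {
    const transcript = extractTranscript(data);
    if (transcript === null) {
      return false;
    }
    sendToChatbot(transcript);
    return true;
  };
}

// Usage: collect the forwarded messages in an array instead of a real chatbot.
const messages: string[] = [];
const handleCommand = makeCommandHandler((message) => {
  messages.push(message);
});
handleCommand({ command: "userInput", text: "  what is  cowpox " });
console.log(messages);
```

Keeping the extraction step separate from the forwarding step means the chatbot only ever sees non-empty, normalized text, and the same handler can be reused if the payload later gains other command types.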







