Building a chatbot using Azure OpenAI and langchain


In today’s digital age, chatbots have become an integral part of customer service, blah blah blah. Let’s cut to the chase and jump into coding.

Assumptions: You already have an OpenAI model deployed in Azure, so you should have the following values: OPENAI_API_BASE, OPENAI_API_TYPE, OPENAI_API_VERSION, OPENAI_API_KEY, DEPLOYMENT and MODEL_NAME. If not, follow any basic YouTube video to get it done.
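Before going further, it can help to check that all of these values are actually available. Here is a minimal sketch, assuming you keep them in environment variables under the same names (that storage choice and the load_settings helper are my assumptions, not a requirement):

```python
import os

# Names taken from the assumptions above. Keeping them in environment
# variables is an assumption of this sketch.
REQUIRED_SETTINGS = [
    "OPENAI_API_BASE",
    "OPENAI_API_TYPE",
    "OPENAI_API_VERSION",
    "OPENAI_API_KEY",
    "DEPLOYMENT",
    "MODEL_NAME",
]

def load_settings(env=None):
    """Return a dict of the required settings, or raise if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SETTINGS}
```

Calling load_settings() at startup fails fast with a list of the missing names instead of failing later inside an API call.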

I will be using Streamlit for the frontend, Python for the backend, the LangChain framework to call the OpenAI APIs, and the Chroma vector database to store the documents.

Scope: Determining the scope of your chatbot project is a crucial first step that sets the boundaries and objectives for its development. List the tasks, then write down the subtasks under each task. For example, if you want the chatbot to answer questions about your products and your policies, then your tasks are 1. Product 2. Policies, and your subtasks are 1.1) Product 1 1.2) Product 2 2.1) Relocation Policy 2.2) Code of Conduct Policy.

Defining the scope is important because LangChain works by creating agents to perform the tasks: one agent per task. If a task can be divided further into mostly independent subtasks, we can create a separate agent for each subtask.
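One way to keep the scope explicit is to write it down as data before building anything. This is only a sketch of that idea: the scope dictionary mirrors the example above, and plan_tools is a hypothetical helper, not part of LangChain.

```python
# Scope from the example above: each task is a key, each subtask an entry in its list.
scope = {
    "Product": ["Product 1", "Product 2"],
    "Policies": ["Relocation Policy", "Code of Conduct Policy"],
}

def plan_tools(scope, split_subtasks=False):
    """List what needs to be built: one entry per task, or one per subtask."""
    if not split_subtasks:
        return list(scope)
    return [f"{task}: {sub}" for task, subs in scope.items() for sub in subs]

print(plan_tools(scope))                       # one entry per task
print(plan_tools(scope, split_subtasks=True))  # one entry per subtask
```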

Requirements

!pip install streamlit_chat==0.1.1
!pip install streamlit==1.20.0
!pip install langchain==0.0.268
!pip install openai==0.27.8
!pip install hnswlib
!pip install chromadb==0.4.6
!pip install holidays==0.29
!pip install typing-inspect==0.9.0
!pip install typing_extensions==4.6.3
!pip install tiktoken==0.4.0
!pip install sentence_transformers==2.2.2
# wave is part of the Python standard library, so it does not need to be installed
!pip install moviepy
!pip install SpeechRecognition
!pip install pytesseract

Data Collection

The data for a chatbot can come in various formats, including text files, PDFs, docs, PPTs, meeting transcripts, meeting recordings, product demo videos, podcasts, etc. Convert every format into text/PDF so the data is easy to read. .ppt/.docx files can be exported to PDF. Video can be converted to text using the code below. An audio podcast can be converted to a text file using the second half of the same code.

import os
import wave, math, contextlib
import speech_recognition as sr
from moviepy.editor import AudioFileClip

zoom_video_file_name = "./raw_data/Product_Demo_Recording.mp4"
transcribed_audio_file_name = "./raw_data/Product_Demo_Recording.wav"
transcribed_txt_file_name = "./cleaned_data/Product_Demo_Recording.txt"

# make sure the output folder exists
os.makedirs(os.path.dirname(transcribed_txt_file_name), exist_ok=True)

# extract the audio track from the video and save it as a .wav file
audioclip = AudioFileClip(zoom_video_file_name)
audioclip.write_audiofile(transcribed_audio_file_name)
audioclip.close()

# for an audio-only source (e.g. a podcast already in .wav format), start from here
with contextlib.closing(wave.open(transcribed_audio_file_name, 'r')) as f:
    frames = f.getnframes()
    rate = f.getframerate()
    duration = frames / float(rate)
total_duration = math.ceil(duration / 60)  # length in minutes, rounded up

print("Converting speech to text...")
r = sr.Recognizer()
# transcribe one 60-second chunk at a time and append it to the text file
with open(transcribed_txt_file_name, "a") as out:
    for i in range(total_duration):
        with sr.AudioFile(transcribed_audio_file_name) as source:
            audio = r.record(source, offset=i*60, duration=60)
        try:
            out.write(r.recognize_google(audio, language='en-IN'))
            out.write(" ")
        except sr.UnknownValueError:
            pass  # no recognizable speech in this chunk

If the data is in image format, convert it to text using an OCR library such as pytesseract or paddleocr:

import pytesseract
from PIL import Image
img = Image.open("./raw_data/work_schedule.png")
txt = pytesseract.image_to_string(img, lang='eng', config=r' --oem 1 --psm 1 -c preserve_interword_spaces=1')
with open('./cleaned_data/work_schedule.txt', 'w') as f:
    f.write(txt)

Store Data into the Database

Once we have everything in .txt/.pdf format, read the files in Python and store them in the database. There are 4 steps here:
1. Read the files using a DocumentLoader.
2. Since a file may have a lot of content (sentences/paragraphs), it's better to split it into smaller chunks.
3. Once we have the chunks, which are called splits in LangChain terminology, convert the text into numerical vectors (embeddings).
4. Store the embeddings in the database so that when you ask a question, the database can return similar documents (splits). How? The question is converted into an embedding using the same embedding function, and the database returns the splits whose embeddings are closest to it (least distance, i.e. most similar).
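To make the "most similar" idea concrete, here is a toy sketch using cosine similarity on hand-written 3-dimensional vectors. The texts and numbers are made up for illustration; a real embedding model produces vectors with hundreds of dimensions, and the vector database may use a different distance metric by default.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k stored texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

# Toy "vector database": (text, embedding) pairs with invented values.
store = [
    ("Returns are accepted within 30 days.", [0.9, 0.1, 0.0]),
    ("Exchanges need the original receipt.", [0.7, 0.3, 0.1]),
    ("The phone has 8 GB of RAM.", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.0]  # pretend embedding of "Can I return my order?"
print(top_k(query, store, k=2))
```

The two policy sentences come back first because their vectors point in nearly the same direction as the query vector.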

For my use case, I stored each type of document in its own vector database.

Example Use Case

Let's take an example: you are a company that sells certain products online. You want your chatbot to do the following:
1. When a user asks a question about a product, answer from a product_details file (Excel/CSV).
2. When a user asks about the return/exchange policy, answer accordingly.

Let's generate the embeddings of this policy document and store them in a vector database.

from langchain.document_loaders import DirectoryLoader
# load the .txt files produced in the data collection step
txt_loader = DirectoryLoader(path = "./cleaned_data/", glob = '*.txt')
docs = txt_loader.load()
# if you have pdf docs use below as well
pdf_loader = DirectoryLoader(path = "./cleaned_data/", glob = '*.pdf')
docs += pdf_loader.load()

from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 1000, #can be adjusted as per your need
    chunk_overlap = 150 #can be adjusted as per your need
)
splits = text_splitter.split_documents(docs)

from langchain.embeddings import HuggingFaceEmbeddings
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

from langchain.vectorstores import Chroma
persist_directory = './return_exchange_policy/' #save the data to the disk at this location
vectordb = Chroma.from_documents(
    documents=splits,
    embedding=embedding,
    persist_directory=persist_directory
)
vectordb.persist() #to save it to the disk

I am using Hugging Face (sentence-transformers: open source/free) embeddings. Others, such as OpenAI or Instructor embeddings, can also be used.
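If chunk_size and chunk_overlap in the splitter are not intuitive, the sketch below shows the basic idea with a plain sliding window over characters. It is a simplification: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, which this hypothetical chunk_text helper does not do.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=150):
    """Cut text into chunks of chunk_size characters, each sharing
    chunk_overlap characters with the previous chunk."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one of the neighbouring chunks, as long as it is shorter than the overlap.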

Tools

Now comes the interesting part. I create one tool for each task.

> If you need a Q&A feature, create a tool using a RetrievalQA chain and give it access to the vector database created in the previous step. This tool will then have access to the return/exchange policy.

from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# let's say this tool answers questions related to the return policy
vectordb = Chroma(
    persist_directory=persist_directory, #read from the database
    embedding_function=embedding
)

retriever = vectordb.as_retriever(search_type="similarity", search_kwargs={"k":3})
# k can be increased if you need a very detailed response to a question
# alternative: let the LLM rephrase the question into several queries
# from langchain.retrievers.multi_query import MultiQueryRetriever
# retriever_from_llm = MultiQueryRetriever.from_llm(
#    retriever=vectordb.as_retriever(search_kwargs={"k":3}), llm=llm
# )

# llm is the AzureChatOpenAI model created in my_agent.py
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever # use either retriever or retriever_from_llm
)
# quick test with a sample question
qa_chain.run("What is the return policy?")

> If you want your bot to read a CSV/Excel file and answer from it, for example a file with product details in columns like product_description, configuration, color, etc., so that the bot gives a relevant answer when the user asks about any product, we can create a PythonAstREPLTool as below.

import pandas as pd
from langchain.tools.python.tool import PythonAstREPLTool

# python operations tool
# load product data in a dataframe
df = pd.read_csv("product_details.csv")
df_columns = df.columns.to_list() # columns is an attribute, not a method
product_tool = PythonAstREPLTool(locals={"df": df}) # provide python_repl tool access to the dataframe
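It helps to know what a Python REPL tool does with the code the LLM writes. As far as I understand it, PythonAstREPLTool parses the snippet, executes every statement except the last, and returns the value of the last expression. The sketch below imitates that idea with the standard library only. run_python_snippet and the products list (with a made-up configuration value) are hypothetical stand-ins for the tool and the dataframe.

```python
import ast

def run_python_snippet(code, variables):
    """Execute all statements but the last, then return the value of the
    last expression (or None if the last statement is not an expression)."""
    namespace = dict(variables)  # copy so the caller's dict is not modified
    tree = ast.parse(code)
    if not tree.body:
        return None
    *setup, last = tree.body
    exec(compile(ast.Module(body=setup, type_ignores=[]), "<snippet>", "exec"), namespace)
    if isinstance(last, ast.Expr):
        return eval(compile(ast.Expression(body=last.value), "<snippet>", "eval"), namespace)
    exec(compile(ast.Module(body=[last], type_ignores=[]), "<snippet>", "exec"), namespace)
    return None

# Stand-in for the dataframe: a list of dicts with invented values.
products = [
    {"product_name": "Kapple kiphone 22 pro", "configuration": "8 GB RAM, 256 GB storage"},
]
llm_code = "[p['configuration'] for p in products if p['product_name'] == 'Kapple kiphone 22 pro']"
result = run_python_snippet(llm_code, {"products": products})
print(result)
```

Note that a tool like this runs whatever code the LLM produces, so only give it access to data and an environment you are comfortable exposing.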

> Create as many tools as required (based on the subtasks) and then initialize an agent that has access to all of them. Give each tool a relevant description that explains when it should be used. You can also create tools that use the internet to perform a web search (Wikipedia/DuckDuckGo etc.), a calculator tool to perform calculations, and custom tools if required.

from langchain.agents import Tool

all_tools = [
    Tool(
        name = "QA Chain", # can provide any name here
        func=qa_chain.run,
        description="""
        Useful for when you need to answer questions related to the return or exchange policy. \
        Answer based on the Observation and return only what you know. \
        If you can not find the answer say you don't have the answer. \
        Do not include any information which is not provided to you.
        """
    ),
    Tool(
        name = "Product Info",
        func=product_tool.run,
        description = f"""
        Useful for when you need to answer questions about product information
        stored in pandas dataframe 'df'. \
        Run python pandas operations on 'df' to help you get the right answer. \
        'df' has the following columns: {df_columns}

        <user>: What is the config of Kapple kiphone 22 pro?
        <assistant>: df[df['product_name'] == 'Kapple kiphone 22 pro']['configuration']
        <assistant>: Kapple kiphone 22 pro has following configuration.
        ...
        """
    ),
]

The agent decides, with the help of the LLM and based on the user input, which tool or tools are best to use. (If more than one tool is required, it uses them in the sequence suggested by the LLM.) Refer to my_agent.py below:

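from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import AzureChatOpenAI
# OPENAI_DEPLOYMENT, OPENAI_API_KEY, etc. hold the values listed under Assumptions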
llm = AzureChatOpenAI(deployment_name=OPENAI_DEPLOYMENT,
    openai_api_key=OPENAI_API_KEY,
    openai_api_base = OPENAI_API_BASE,
    openai_api_type = OPENAI_API_TYPE,
    openai_api_version = OPENAI_API_VERSION,
    model_name=OPENAI_MODEL_NAME,
    temperature=0.0 # can be increased upto 1 and tested
    )
# if you are using openai open platform model
# from langchain.chat_models import ChatOpenAI
# llm = ChatOpenAI(    
#     openai_api_key=openai_api_key, 
#     model_name=model_name, # "gpt-3.5-turbo", 
#     temperature=0.0
#     )

agent_kwargs = {'prefix': (
    'You are a friendly assistant. You are tasked to assist the user on questions '
    'related to return/exchange policy or product information. '
    'You have access to the following tools:'
)}

# optional, will increase the cost but helps the bot remember the conversation context
# (needs a conversational agent type such as AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION)
# from langchain.memory import ConversationBufferMemory
# memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

agent = initialize_agent(all_tools,
                         llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, # or 'chat-conversational-react-description'; try which works better
                         verbose=True,
                        #  memory=memory,
                         early_stopping_method="generate",
                         max_iterations=len(all_tools),
                         agent_kwargs=agent_kwargs
                         )

# run a quick test only when this file is executed directly, not when app.py imports it
if __name__ == "__main__":
    user_query = "What is the return policy?"
    print(agent.run(user_query)) # will provide the answer
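To make the tool-selection step less of a black box: with a ReAct-style agent, the LLM replies in a text format containing "Action:" and "Action Input:" lines, and the agent parses that text and calls the matching tool. The sketch below imitates only that parse-and-dispatch step. The two tool functions, the regular expression, and the fake LLM output are hypothetical stand-ins, not LangChain's actual implementation.

```python
import re

# Stand-in tools: plain functions instead of the real chains (assumption).
def qa_tool(query):
    return f"[policy answer for: {query}]"

def product_tool(query):
    return f"[product answer for: {query}]"

TOOLS = {"QA Chain": qa_tool, "Product Info": product_tool}

# Simplified pattern for the "Action: ... / Action Input: ..." lines.
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)

def dispatch(llm_output):
    """Find the tool name and input in the LLM text and call that tool."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        raise ValueError("No action found in LLM output")
    tool_name = match.group(1).strip()
    tool_input = match.group(2).strip()
    return tool_name, TOOLS[tool_name](tool_input)

fake_llm_output = (
    "Thought: The user is asking about returns.\n"
    "Action: QA Chain\n"
    "Action Input: What is the return window?"
)
print(dispatch(fake_llm_output))
```

The tool's return value is fed back to the LLM as the Observation, and the loop repeats until the LLM gives a final answer or max_iterations is reached. This is why the tool names and descriptions matter so much: they are all the LLM sees when choosing an action.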

User Interface

Now that we have developed the backend, let's build a simple UI using Streamlit. The UI code is adapted from here. Create app.py as below:

import streamlit as st
import random
from streamlit_chat import message
from my_agent import agent

st.subheader("Dear customer. I am your customer officer bot.")
st.markdown("I can answer questions related to any product information, configuration, return and exchange policy.")

if "past" not in st.session_state:
    st.session_state["past"] = []
if "generated" not in st.session_state:
    st.session_state["generated"] = []
if "input_message_key" not in st.session_state:
    st.session_state["input_message_key"] = str(random.random())

chat_container = st.container()
user_query = st.text_input("Type your query and press <Send> button.", key=st.session_state["input_message_key"])

if st.button("Send"):
    response = agent.run(user_query)
    st.session_state["past"].append(user_query)
    st.session_state["generated"].append(response)
    st.session_state["input_message_key"] = str(random.random())
    # print('input mess key', st.session_state["input_message_key"])
    st.experimental_rerun()

if st.session_state["generated"]:
    with chat_container:
        for i in range(len(st.session_state["generated"])):
            message(st.session_state["past"][i], is_user=True, key=str(i) + "_user")
            message(st.session_state["generated"][i], key=str(i))

Conclusion

I hope the above explanation and code are helpful to you. Please note that this is only meant to help you start coding your own bot, and you will need to do more work when you develop it yourself. If you want to suggest improvements, please feel free to comment below. If you run into any issues while creating your own bot, please write them down in the comments and I will try to answer if I can.

This is just 20% of the entire chatbot product. Reach out to the Crimsonlogic sales team for sales inquiries and a demo. Thanks for reading! See you next time.