Qendel AI

Summary

This article is a guide to building a Retrieval Augmented Generation (RAG) pipeline with the Mistral 7B Instruct model, Chroma as a vector database, and Langchain as an orchestration framework.

Abstract

In this article, the author outlines a step-by-step guide to building a Retrieval Augmented Generation (RAG) pipeline with the Mistral 7B Instruct model. The pipeline is designed to improve the performance of LLM chains by incorporating context retrieved from a Chroma vector database. The author walks through the process, beginning with creating an LLM Chain instance and installing the necessary libraries, then demonstrates how to obtain and chunk documents, initialize an embedding model, create a vector store instance, and finally create and query a question-answering (QA) chain. Several experiments demonstrate the effectiveness of the Mistral 7B Instruct model at answering questions about the provided context, and the article concludes with plans for future improvements, including fine-tuning the model for better performance.

Bullet points

  • Create an LLM Chain instance
  • Install and Import Libraries
    • !pip install -U sentence-transformers chromadb
    • Libraries: langchain, sentence-transformers, chromadb
  • Get Experimentation Documents
    • Previous articles on persona prompt pattern and audience persona pattern
  • Chunk Documents
    • Using Langchain’s recursive character splitter for chunking
    • Create Document object from text documents
    • Split documents into chunks with RecursiveCharacterTextSplitter
  • Download an Embedding Model
    • Using the best embedding model on Huggingface’s embedding leaderboard
  • Initiate a Vector Store Instance
    • Initiate Chroma DB as a vector store for this experiment
    • Create a retriever from the Chroma DB instance
  • Create Your Question Answering (QA) Chain
    • Choose between an out-of-the-box QA chain or a customizable QA chain with a custom prompt
    • Create a customizable QA chain using the RetrievalQA class, passing in the LLM and retriever objects and a custom prompt
  • Query Mistral 7B Instruct Model
    • Define the question to ask the LLM
    • Query the LLM using the customizable QA chain, passing in the question as a dictionary
  • Future Improvements
    • Fine-tune Mistral 7B Instruct model to improve performance
    • Future article on fine-tuning Mistral 7B Instruct model

The article presents an example of using the Mistral 7B Instruct model to answer questions about the provided context and outlines the steps needed to create a RAG pipeline with it: creating an LLM chain, chunking documents, downloading an embedding model, initiating a vector store instance, creating a QA chain, and querying the LLM. The author plans to write a future article on fine-tuning the Mistral 7B Instruct model to further improve its performance.

RAG Pipeline with Mistral 7B Instruct Model in Colab: A Step-by-Step Guide

Mistral 7B Instruct model RAG Pipeline. Source: Created by the author.

Recent Large Language Models (LLMs) deliver substantial value, especially when answering questions related to your data.

In this article, I will show you exactly how I created a Retrieval Augmented Generation (RAG) pipeline with:

  • Mistral 7B Instruct as a reasoning model,
  • Chroma as a vector database,
  • Langchain as an orchestration framework.

Building a RAG Pipeline with the Mistral 7B Instruct Model

Step 1

Create an LLM Chain

I have already discussed how to get started with Mistral 7B and create an LLM Chain instance; please start with my previous article. A condensed sketch is included below for reference.
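
The sketch assumes a GPU runtime in Colab; the model name, dtype, and generation settings here are illustrative and may differ slightly from the previous article’s exact setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.llms import HuggingFacePipeline

# Load Mistral 7B Instruct in half precision onto the available GPU
model_name = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# Wrap the model in a text-generation pipeline
text_gen = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=256,
)

# LangChain-compatible LLM object used by the QA chain later in this article
llm = HuggingFacePipeline(pipeline=text_gen)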

Step 2

Install and Import Libraries

Most of the libraries were already installed in Step 1; the list below covers only the additional ones.

!pip install -U sentence-transformers chromadb

# Document wrapper and text chunking
from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Vector store, retrieval QA chain, prompt template, and embeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings

Step 3

Get Experimentation Documents

For this experiment, I will leverage my two previous articles on Persona Prompt Pattern and Audience Persona Pattern.

persona_prompt = """
As crazy as it sounds, a simple 5-word prompt prefix helps ChatGPT think like any professional and enables you to find better responses on the first try.

To my surprise, I pinpointed the issue with my car by having ChatGPT act and think like a mechanic last month. This means people can now use ChatGPT to think like you and do what you do for them, especially if your task is not overly sophisticated. You can also use it to make your life easier — I will show you how and when in this article.

Curious about how this magic works?

It is called the Persona prompt pattern. Here’s its basic form:

Act as a {Profession/Celebrity/…/Fictional Character},

Do {Task Name}

Examples:

Act as a personal stylist. I need advice on what to wear to a job interview in the creative industry.
Act as a financial advisor. Given my income and expenses, provide me with a personalized budget plan for the next six months.
Now that we have seen the basic persona prompting form, these are 3 scenarios when you should put the persona prompt to work:

1. When you need a customized perspective
Code reviewer
Financial analyst
Historical expert
Data Scientist
Etc.
These aren’t just job titles but roles ChatGPT can act as to provide viable solutions to your problems.

Throw one-size-fits-all responses out the window: use the Persona Prompt Pattern to mould ChatGPT’s response to fit a specific point of view. Imagine ChatGPT teaching you about taxes like a tax expert or giving you business tips like an expert who has run a business for 10 years would.

It is not just about getting information anymore but about getting information with character.

Example:

Act as a business expert. I’m planning on launching a startup in the e-commerce sector. What are some challenges and tips I should consider?

2. When you know ‘who’ you need but not ‘what’ you need.
Most of us almost always know ‘who’ we need help from better than ‘what’ we need.

How often do you think a mechanic could help with your car issue?

All the time.

How often do you know that the issue is the compressor?

Only sometimes.

With the Persona Pattern, you only need to know ‘who’ can help you do the task.

Example:

Act as an interior designer. I’ve moved into a new apartment, and it’s too bland. Can you give me some design ideas?

3. When you want to interact with ChatGPT in an “Adaptive,” not “rigid” manner
Most people have ‘rigid’ interactions with ChatGPT.

How so?

People plainly ask ChatGPT and get on with their day with basic responses. But there is more for you if you ask better. You can make your next interaction with ChatGPT “adaptive” if you provide clear guidance.

Sounds intriguing?

Just look at these levels of interaction with ChatGPT:

Rigid Interaction: You plainly ask a question, and ChatGPT “responds” based on a General Persona

Example:

Can you give me a vegetarian recipe?

Flexible interaction (with limited adaptability): You introduce parameters into your question (e.g., behavior or profession), and ChatGPT “adapts” based on your specific input.

Example:

I have a friend coming over who is a vegetarian. Can you suggest a special dish?

Persona Pattern: You instruct ChatGPT to act and think like a persona and respond with the persona’s perspective in an “adaptive” manner.

Example:

Act as a gourmet chef. I have a friend coming over who is a vegetarian. I want to impress my friend with a special vegetarian dish. What do you recommend?

Rigid < Flexible < Persona Prompt.

Interact with flexibility and adaptability in mind.

Limitations of Persona Prompt
Sometimes, the persona pattern might make interesting assumptions and hallucinations. It is tough to spot the assumptions if we know little about the persona’s expertise.

Be specific with your questions to avoid unnecessary assumptions.

Conclusion
You can get much better responses from ChatGPT by simply asking better. You can access tailored expertise at your fingertips using the Persona Prompt pattern. Whether seeking insights from a seasoned financial analyst or fashion advice tailored to creative job interviews, the Persona Prompt is your ticket to open the next door for you.

Thanks for reading. 
"""

audience_persona = """
Stop throwing your words to the wrong ears!

Imagine you have a well-crafted message but throw it into diverse minds — people of different ages, professions, upbringings, etc.

The Brutal Truth?

Your well-thought-out message means NOTHING if it doesn’t resonate with the audience.

For Example, how would you explain AI to

A 5-year-old child?
A painter?
A Doctor?
An Engineer?
A Pirate?
People from the 18th century?
This is where ChatGPT can help you — tailoring your message to resonate with a specific audience.

In my last article about Persona Prompt Pattern, we looked into how to use ChatGPT to act and think like a persona, but now we go to the other side. We will see how ChatGPT tailors its response to a particular audience.

The “Audience Persona Pattern”

The Audience Persona Pattern uses the audience’s knowledge to tailor ChatGPT’s response, so the same message can be tailored to each audience.

Interesting?

Here’s the Audience Persona Pattern basic form

[Your Question]. Assume I am {Audience Persona}

Examples:

Explain to me how Digital Currency works. Assume that I am a 5-year-old child.
Explain how the car engine works. Assume I am a painter.
Explain the concept of Artificial Intelligence to a Doctor.
Now let’s see how ChatGPT explains what Digital Currency is to the above personas — and compare the results.

Question: Explain to me how Digital Currency works.

Audience Personas:

1. A 5-year-old child

[Image: ChatGPT explaining Digital Currency to a 5-year-old child]

2. A Painter

[Image: ChatGPT explaining Digital Currency to a Painter]

3. A Doctor

[Image: ChatGPT explaining Digital Currency to a Doctor]
Conveying Your Crafted Message to an Audience
Suppose you want to convey a specific message to a particular audience. You can paste the message into ChatGPT and ask it to explain it to that audience persona.

Basic form:

[PASTE YOUR MESSAGE HERE]. Explain this to [AUDIENCE PERSONA]

Example:

Message: Description of how Large Language Models Work

[Image: Asking ChatGPT to tailor my message to a 5-year-old child]

Response: a message tailored to a 5-year-old child.

[Image: A message tailored for a 5-year-old child]
See how ChatGPT tailors your message and uses the audience’s familiar language and examples.

You can do it, too. You can utilize the Audience Persona Pattern to tailor your message to resonate with your audience.

Remember,

If you speak to all, you reach none.

Happy Tailoring 

Thanks For Reading
"""

Step 4

Chunk Documents

I will use Langchain’s recursive character splitter for chunking.

# Create Document object from text documents 
docs = [Document(page_content=post) for post in [persona_prompt, audience_persona]]

# Split documents into chunks 

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=10, separators=['\n\n', '\n', '.']
)

document_chunks = text_splitter.split_documents(docs)
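
As a quick, optional sanity check (my addition, not part of the original pipeline), you can inspect how many chunks were produced and peek at the first one:

# Optional: inspect the resulting chunks
print(f"Number of chunks: {len(document_chunks)}")
print(document_chunks[0].page_content[:200])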

Step 5

Download an Embedding Model

I am using one of the top embedding models on Huggingface’s MTEB embedding leaderboard at the time of writing. Feel free to use another.

# Download the embedding model (the English BGE model, since the documents are in English)
embedding_model = SentenceTransformerEmbeddings(model_name='BAAI/bge-large-en-v1.5')
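
To confirm the model loaded correctly, you can embed a sample query. This check is illustrative and not part of the original walkthrough:

# Optional: embed a sample query to verify the model works
sample_vector = embedding_model.embed_query("What is the persona prompt pattern?")
print(len(sample_vector))  # bge-large models produce 1024-dimensional vectors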

Step 6

Initiate a Vector Store Instance

Initiating Chroma DB as a vector store for this experiment.

# Initiate a chromadb instance 

chroma_db = Chroma.from_documents(document_chunks, embedding_model)

retriever = chroma_db.as_retriever()
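
Before wiring up the QA chain, it is worth verifying that retrieval surfaces sensible chunks. This quick test is my own illustrative addition, using the classic LangChain retriever API:

# Optional: check that the retriever returns relevant chunks
relevant_chunks = retriever.get_relevant_documents("basic form of the persona prompt pattern")
for chunk in relevant_chunks:
    print(chunk.page_content[:120], "...")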

Step 7

Create Your Question Answering (QA) Chain

We have two options to create the QA Chain (use only one):

Option 1: An out-of-the-box QA chain you can readily use

# Option 1 
qa_chain = RetrievalQA.from_chain_type(llm, retriever=retriever)

Option 2: A customizable QA chain in which you can use your own prompt and parameters

# Prompt template 
qa_template = """<s>[INST] You are a helpful assistant.
Use the following context to answer the question below briefly:

{context}

{question} [/INST] </s>
"""

# Create a prompt instance 
QA_PROMPT = PromptTemplate.from_template(qa_template)

# Custom QA Chain 
qa_chain = RetrievalQA.from_chain_type(
    llm,
    retriever=retriever,
    chain_type_kwargs={"prompt": QA_PROMPT}
)
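
If you also want to see which chunks grounded each answer, RetrievalQA accepts a return_source_documents flag. This variant is optional and not part of the original chain:

# Optional variant: also return the retrieved chunks for inspection
qa_chain_with_sources = RetrievalQA.from_chain_type(
    llm,
    retriever=retriever,
    chain_type_kwargs={"prompt": QA_PROMPT},
    return_source_documents=True,
)
# response = qa_chain_with_sources({"query": question})
# response['source_documents'] then holds the retrieved chunks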

Final Step

Query Mistral 7B Instruct Model

# Your Question 
question = "YOUR QUESTION HERE"

# Query Mistral 7B Instruct model
response = qa_chain({"query": question})

# Print your result 
print(response['result'])
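
The responses in the Review section below appear as pretty-printed output; if you want the same formatting, Python’s built-in pprint works:

from pprint import pprint
pprint(response['result'])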

🧠 Review

Here are some questions I asked the Mistral 7B Instruct model about my documents, along with its responses:

Experiment 1:

question = "What is the basic form of a persona prompt pattern"

Response:
The basic form of a persona prompt pattern is Act as a {Profession/Celebrity/…/Fictional Character}, Do {Task Name}. Examples include "Act as a chef, Cook a meal for a family of four" or "Act as a detective, Solve a murder mystery".

Experiment 2:

question = "What is the basic form of audience persona prompt"

Response:
[Your Question]. Assume I am {Audience Persona}.

Experiment 3:

question = "what are the limitations of persona pattern?"

Response:
As an assistant, I can tell you that there are several limitations to using the persona prompt.

Overall, the Mistral 7B Instruct model provided great responses, though they are not perfect yet. Try it out for your use case, and let us know how it worked for you in the comments.

What’s Next

The Mistral 7B Instruct model is helpful but not perfect for all use cases yet — it needs fine-tuning to boost performance.

Here’s my article on fine-tuning the Mistral 7B Instruct model in 9 easy steps:

🎖️Thanks For Reading🎖️

⚡️LIGHT UP⚡️ this article with a C-L-A-P👏

🚀 F-O-L-L-O-W Qendel AI for upcoming content👇

💡 Fine-tuning Mistral 7B Instruct model

💡 Evaluate Fine-tuned Mistral 7B Instruct Model
