
LANGCHAIN: What Is OpenAI’s RAG and How Can It Be Applied?
Any fool can write code that a computer can understand. Good programmers write code that humans can understand. — Martin Fowler
RAG (Retrieval-Augmented Generation), as discussed at OpenAI’s demo day, is a technique that combines information retrieval with language generation: relevant documents are retrieved first and then passed to a language model as context for its answer. RAG can be applied in several ways to improve the performance of language models in different applications. Below, let’s explore how you can implement some of the RAG strategies mentioned in OpenAI’s demo day.
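Before the individual strategies, it may help to see the overall retrieve-then-generate loop in miniature. The sketch below is only an illustration under stated assumptions: `embed`, `cosine_similarity`, `retrieve`, and `build_prompt` are hypothetical helper names, the word-count "embedding" is a toy stand-in for a real embedding model, and no language model is actually called; the script only builds the prompt that would be sent to one.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical mini-corpus, for illustration only
documents = [
    "Annoy builds approximate nearest neighbor indexes",
    "LangChain provides retrievers and chains",
    "Cosine similarity compares the angle between vectors",
]

question = "how does cosine similarity compare vectors"
top_docs = retrieve(question, documents, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

In a real application the toy `embed` would be replaced by an embedding model and the prompt would be passed to a language model; the strategies below each refine the retrieval half of this loop.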
Baseline
Distance-based vector database retrieval embeds queries in a high-dimensional space and finds similar embedded documents based on “distance”. One of the base-case retrieval methods used in the OpenAI study is cosine similarity. Below is an example of vector similarity search in Python using the Annoy library, whose angular metric is derived from cosine similarity:
from annoy import AnnoyIndex
# Example code for creating and querying an Annoy index
# Create an index
f = 40 # Length of item vector that will be indexed
t = AnnoyIndex(f, 'angular') # 'angular' is Annoy's cosine-based distance
t.add_item(0, list(range(f))) # Item 0 is the vector [0, 1, ..., 39]
t.build(10) # 10 trees
t.save('test.ann')
# Query the index
u = AnnoyIndex(f, 'angular')
u.load('test.ann') # super fast, will just mmap the file
# The query vector must also have length f; with a single indexed item, at most one neighbor can come back
print(u.get_nns_by_vector(list(range(1, f + 1)), 3))
Query Transformations
Query transformations modify user input to improve retrieval. Here’s an example of implementing query expansion using LangChain’s Multi-query retriever in Python:
from langchain.retrievers.multi_query import MultiQueryRetriever
# Assumes vectordb (a LangChain vector store) and llm (a chat model) already exist;
# the import path can differ between LangChain versions
# Create a MultiQueryRetriever on top of the base retriever
retriever = MultiQueryRetriever.from_llm(retriever=vectordb.as_retriever(), llm=llm)
# Calling the retriever generates multiple queries from the input, retrieves relevant
# documents for each query, and returns the unique union across all queries
unique_documents = retriever.invoke("user input query")
Routing
Routing questions to the appropriate source becomes critical when querying across multiple datastores. The following code snippet sketches routing in LangChain, using an LLM to gate user input into a set of defined sub-chains:
from langchain_core.runnables import RunnableBranch
# Assumes existing runnables: classifier_chain (an LLM chain that returns a sub-chain name as a string), chain1, and chain2
# Define the branch: chain1 when the label matches, chain2 as the default
branch = RunnableBranch((lambda x: "chain1" in x["label"].lower(), chain1), chain2)
# Gate user input into the sub-chains
full_chain = {"label": classifier_chain, "question": lambda x: x["question"]} | branch
result = full_chain.invoke({"question": user_input})
These are just a few examples of how you can apply OpenAI’s RAG strategies in your own applications. Each method can be implemented using different tools and libraries depending on your specific requirements and use case.
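To illustrate that last point, here is a framework-free sketch of two of the ideas above: taking the unique union of results from several query variants, and routing a question to a datastore. It is not how LangChain implements either step. The router here matches keywords rather than asking an LLM, and `unique_union`, `route`, and the datastore names are all hypothetical, chosen only for the example.

```python
def unique_union(result_lists):
    """Merge several result lists, keeping first-seen order and dropping duplicates."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def route(user_input, routes, default):
    """Pick the first route whose keywords appear in the input, else the default."""
    words = set(user_input.lower().split())
    for name, keywords in routes.items():
        if words & keywords:
            return name
    return default


# Hypothetical datastores and trigger keywords, for illustration only
routes = {
    "billing_docs": {"invoice", "refund", "payment"},
    "api_docs": {"endpoint", "token", "request"},
}

print(route("How do I get a refund", routes, default="general_docs"))  # billing_docs
print(unique_union([["doc_a", "doc_b"], ["doc_b", "doc_c"]]))  # ['doc_a', 'doc_b', 'doc_c']
```

A keyword router is cheap and predictable but brittle; an LLM-based gate handles paraphrases better at the cost of an extra model call, which is the trade-off to weigh for your use case.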
Conclusion
Implementing RAG strategies can improve language model performance in your applications. However, it’s important to evaluate how effective each method actually is, so that time and effort are not wasted on techniques that do not help. LangSmith supports RAG evaluation with a variety of tools for assessing the performance of advanced RAG chains.
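As a rough idea of what such an evaluation measures, the sketch below computes hit rate at k, one common retrieval metric, for two strategies on the same queries. It does not use LangSmith or reproduce its API; `hit_rate_at_k` is a hypothetical helper, and the query and document ids are invented toy data, not measurements.

```python
def hit_rate_at_k(retrieved, relevant, k):
    """Fraction of queries whose top-k retrieved ids include at least one relevant id."""
    if not retrieved:
        return 0.0
    hits = 0
    for query, ranked_ids in retrieved.items():
        if set(ranked_ids[:k]) & relevant[query]:
            hits += 1
    return hits / len(retrieved)


# Toy data, invented for illustration: relevant ids per query,
# plus ranked results from two hypothetical retrieval strategies
relevant = {"q1": {"d1"}, "q2": {"d4"}, "q3": {"d7"}}
baseline = {"q1": ["d1", "d2"], "q2": ["d3", "d5"], "q3": ["d8", "d7"]}
multi_query = {"q1": ["d1", "d2"], "q2": ["d4", "d3"], "q3": ["d7", "d8"]}

for name, results in [("baseline", baseline), ("multi_query", multi_query)]:
    print(name, hit_rate_at_k(results, relevant, k=1), hit_rate_at_k(results, relevant, k=2))
```

Running the same metric over each strategy on your own labeled queries gives a like-for-like comparison before you commit to a more complex chain.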
By understanding and implementing these methods, you can tailor RAG strategies to your specific application, ultimately achieving better natural language understanding and generation.
