
LANGCHAIN — Deconstructing RAG
Software is a great combination between artistry and engineering. — Bill Gates
Retrieval augmented generation (RAG) is a core pattern in large language model (LLM) app development: relevant information is retrieved from an external source and loaded into the LLM's context window before it generates an output. This article deconstructs RAG into five themes (query transformations, routing, query construction, indexing, and post-processing) and gives a short code snippet for each technique.
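To make the basic loop concrete, here is a minimal, library-free sketch of the retrieve-then-prompt step. The word-overlap retriever is a toy stand-in for a real vector search, and the names `tokenize`, `retrieve`, `build_prompt`, and `DOCS` are assumptions made for illustration, not part of LangChain.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by how many question words they share."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Small illustrative corpus.
DOCS = [
    "The Red Sox won the World Series in 2018.",
    "The Patriots won the Super Bowl in February 2019.",
    "Cypher is a query language for graph databases.",
]
```

The string returned by `build_prompt` is what would be sent to the LLM; every technique in the sections that follow changes either what goes into `retrieve` or what comes out of it.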
Query Transformations
Query expansion
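Query expansion splits or paraphrases one question into several, retrieves for each, and merges the results. The sketch below covers only the merge step, assuming the sub-questions have already been produced by an LLM. `multi_query_retrieve`, `SUB_QUESTIONS`, `FAKE_INDEX`, and `fake_retriever` are hypothetical names used for illustration.

```python
from typing import Callable


def multi_query_retrieve(
    queries: list[str],
    retriever: Callable[[str], list[str]],
) -> list[str]:
    """Run every query variant and return the de-duplicated union of results,
    keeping first-seen order."""
    seen: set[str] = set()
    merged: list[str] = []
    for query in queries:
        for doc in retriever(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical sub-questions an LLM might produce for the original question.
SUB_QUESTIONS = [
    "When did the Red Sox last win a championship?",
    "When did the Patriots last win a championship?",
]

# Hypothetical retriever backed by a fixed lookup table of document ids.
FAKE_INDEX = {
    SUB_QUESTIONS[0]: ["red-sox-history", "boston-sports-overview"],
    SUB_QUESTIONS[1]: ["patriots-history", "boston-sports-overview"],
}


def fake_retriever(query: str) -> list[str]:
    """Return the canned results for a query, or nothing if it is unknown."""
    return FAKE_INDEX.get(query, [])
```

Calling `multi_query_retrieve(SUB_QUESTIONS, fake_retriever)` returns each document id once, even though both sub-questions retrieve the shared overview document.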
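from langchain.retrievers import MultiQueryRetriever
# Perform sub-question generation.
# Schematic only: MultiQueryRetriever is normally built with
# MultiQueryRetriever.from_llm(retriever=..., llm=...), and
# generate_sub_questions is an illustrative method name, not a confirmed API.
multi_query_retriever = MultiQueryRetriever()
sub_questions = multi_query_retriever.generate_sub_questions("Who won a championship more recently, the Red Sox or the Patriots?")
Query re-writing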
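The rewrite-retrieve-read pattern asks an LLM to turn a poorly framed question into a cleaner search query, retrieves with that query, and then answers the original question. A minimal sketch follows, assuming the LLM and the retriever are passed in as plain callables. `REWRITE_TEMPLATE` and `rewrite_retrieve_read` are hypothetical names, and the prompt wording is an assumption.

```python
from typing import Callable

# Assumed prompt wording; tune it for the model in use.
REWRITE_TEMPLATE = (
    "Rewrite the following question as a concise search query.\n"
    "Question: {question}\n"
    "Search query:"
)


def rewrite_retrieve_read(
    question: str,
    llm: Callable[[str], str],
    retriever: Callable[[str], list[str]],
) -> str:
    """Rewrite the question, retrieve with the rewrite, then answer the original."""
    # Step 1: rewrite the user's question into a search query.
    search_query = llm(REWRITE_TEMPLATE.format(question=question)).strip()
    # Step 2: retrieve with the rewritten query, not the raw question.
    context = "\n".join(retriever(search_query))
    # Step 3: read, i.e. answer the original question from the retrieved context.
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```

Keeping the original question in the final prompt means the rewrite affects only what is retrieved, not what is answered.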
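# Schematic only: langchain.rewriters and RewriteRetrieveRead are illustrative
# names, not a confirmed LangChain API.
from langchain.rewriters import RewriteRetrieveRead
# Rewrite user questions
rewriter = RewriteRetrieveRead()
rewritten_question = rewriter.rewrite_question("Poorly framed user question")
Query compression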
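In a chat setting, a follow-up such as "When did they last win?" cannot be retrieved on its own, so the history is first condensed into one standalone question. The sketch below shows one way to build that prompt, assuming the LLM is a plain callable. `build_condense_prompt` and `compress_history` are hypothetical helpers, and the prompt wording is an assumption.

```python
from typing import Callable


def build_condense_prompt(chat_history: list[tuple[str, str]], follow_up: str) -> str:
    """Format prior (speaker, message) turns plus the follow-up into one prompt
    that asks for a standalone question."""
    transcript = "\n".join(f"{speaker}: {message}" for speaker, message in chat_history)
    return (
        "Given the conversation below, rephrase the follow-up as a standalone question.\n\n"
        f"{transcript}\n\n"
        f"Follow-up: {follow_up}\n"
        "Standalone question:"
    )


def compress_history(
    chat_history: list[tuple[str, str]],
    follow_up: str,
    llm: Callable[[str], str],
) -> str:
    """Ask the LLM for a standalone question and strip surrounding whitespace."""
    return llm(build_condense_prompt(chat_history, follow_up)).strip()
```

The returned question is then passed to the retriever in place of the raw follow-up.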
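# Schematic only: langchain.compression and CompressChatHistory are illustrative
# names, not a confirmed LangChain API. chat_history is assumed to be defined.
from langchain.compression import CompressChatHistory
# Compress chat history into a final question for retrieval
compressor = CompressChatHistory()
final_question = compressor.compress_history(chat_history)
Routing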
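Routing decides which datastore should receive a query. In practice an LLM often makes that choice; the sketch below substitutes a simple keyword rule so the control flow can be run without a model. The route names, the keyword sets, and `route_query` are all assumptions made for illustration.

```python
import re

# Hypothetical datastores and the keywords that suggest each one.
ROUTES = {
    "sql_database": {"revenue", "orders", "count", "average"},
    "graph_database": {"connected", "relationship", "friends", "path"},
}
DEFAULT_ROUTE = "vector_store"


def route_query(query: str) -> str:
    """Return the route whose keyword set overlaps the query most;
    fall back to the default route when nothing matches."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best_route, best_hits = DEFAULT_ROUTE, 0
    for route, keywords in ROUTES.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_route, best_hits = route, hits
    return best_route
```

Swapping the keyword rule for an LLM call that returns one of the route names would leave the calling code unchanged.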
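# Schematic only: langchain.routing and DynamicQueryRouting are illustrative
# names, not a confirmed LangChain API. query is assumed to be defined.
from langchain.routing import DynamicQueryRouting
# Dynamic query routing
router = DynamicQueryRouting()
route = router.route_query(query)
Query Construction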
Text-to-SQL
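Text-to-SQL gives the LLM the table schema together with the question, then runs the SQL it returns. The sketch below uses Python's built-in `sqlite3` module and takes the LLM as a plain callable. `build_sql_prompt` and `answer_with_sql` are hypothetical helpers, the prompt wording is an assumption, and the SELECT-only check is a basic precaution rather than a complete safeguard.

```python
import sqlite3
from typing import Callable


def build_sql_prompt(question: str, table_info: str) -> str:
    """Combine the schema description and the question into one prompt."""
    return (
        "Write one SQLite SELECT statement that answers the question.\n"
        f"Schema:\n{table_info}\n"
        f"Question: {question}\nSQL:"
    )


def answer_with_sql(
    question: str,
    table_info: str,
    llm: Callable[[str], str],
    connection: sqlite3.Connection,
) -> list[tuple]:
    """Ask the LLM for SQL, refuse anything that is not a SELECT, and run it."""
    sql = llm(build_sql_prompt(question, table_info)).strip()
    if not sql.lower().startswith("select"):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    return connection.execute(sql).fetchall()
```

For real use, a read-only database connection is a stronger guard than inspecting the generated statement.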
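# Schematic only: langchain.query_construction and TextToSQL are illustrative
# names, not a confirmed LangChain API. The inputs are assumed to be defined.
from langchain.query_construction import TextToSQL
# Translate natural language into SQL requests
text_to_sql = TextToSQL()
sql_query = text_to_sql.convert_to_sql(natural_language_question, table_info)
Text-to-Cypher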
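# Schematic only: langchain.query_construction and TextToCypher are illustrative
# names, not a confirmed LangChain API. The input is assumed to be defined.
from langchain.query_construction import TextToCypher
# Translate natural language into Cypher queries
text_to_cypher = TextToCypher()
cypher_query = text_to_cypher.convert_to_cypher(natural_language_question)
Text-to-metadata filters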
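With metadata filtering, a question is split into a semantic search string and a set of exact-match constraints on document metadata. The sketch below shows the target structure and how the constraints are applied, assuming an LLM has already produced the structured query. `StructuredQuery`, `apply_filters`, and `MOVIES` are hypothetical names and data used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StructuredQuery:
    """A semantic search string plus exact-match metadata constraints."""
    query: str
    filters: dict[str, object] = field(default_factory=dict)


def apply_filters(documents: list[dict], structured: StructuredQuery) -> list[dict]:
    """Keep only documents whose metadata equals every filter value."""
    return [
        doc
        for doc in documents
        if all(doc["metadata"].get(key) == value for key, value in structured.filters.items())
    ]


# Hypothetical corpus with metadata attached to each document.
MOVIES = [
    {"text": "A crew meets an alien.", "metadata": {"year": 1979, "genre": "sci-fi"}},
    {"text": "A hotel caretaker unravels.", "metadata": {"year": 1980, "genre": "horror"}},
    {"text": "Rebels flee an ice planet.", "metadata": {"year": 1980, "genre": "sci-fi"}},
]

# What an LLM might return for "sci-fi movies from 1980 about rebels".
EXAMPLE = StructuredQuery(query="rebels", filters={"year": 1980, "genre": "sci-fi"})
```

Only the documents that pass `apply_filters` would then be ranked against the `query` string by the vector store.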
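# Schematic only: langchain.query_construction and MetadataFiltering are
# illustrative names, not a confirmed LangChain API. The inputs are assumed to be defined.
from langchain.query_construction import MetadataFiltering
# Translate natural language into structured queries with metadata filters
metadata_filter = MetadataFiltering()
structured_query = metadata_filter.construct_query(natural_language_question, metadata_fields)
Indexing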
Chunk size
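Chunk size controls how much text each embedded unit carries, so it is worth trying several values. The sketch below is a character-based splitter with optional overlap, plus a helper that reports how many chunks each candidate size yields. `split_into_chunks` and `chunk_counts` are hypothetical helpers; real splitters often break on sentences or tokens rather than raw characters.

```python
def split_into_chunks(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into windows of chunk_size characters, where consecutive
    windows share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


def chunk_counts(text: str, chunk_sizes: list[int]) -> dict[int, int]:
    """Report how many chunks each candidate size produces (no overlap)."""
    return {size: len(split_into_chunks(text, size)) for size in chunk_sizes}
```

The chunk count is only a first signal; choosing a size ultimately means measuring retrieval quality on representative questions.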
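# Schematic only: langchain.indexing and ChunkSizeTuning are illustrative
# names, not a confirmed LangChain API. The inputs are assumed to be defined.
from langchain.indexing import ChunkSizeTuning
# Experiment with chunk sizes for document embedding
chunk_tuner = ChunkSizeTuning()
performance_boost = chunk_tuner.test_chunk_sizes(document, chunk_sizes)
Document embedding strategy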
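Decoupling means the unit that is searched need not be the unit handed to the LLM: a short summary can be indexed for retrieval while the full document is returned for answer synthesis. The sketch below illustrates that mapping. Word overlap stands in for embedding similarity, and `SummaryIndex` and `tokens` are hypothetical names.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class SummaryIndex:
    """Search over short summaries, but hand back the full parent documents."""

    def __init__(self) -> None:
        self._summaries: dict[str, str] = {}
        self._full_docs: dict[str, str] = {}

    def add(self, doc_id: str, summary: str, full_text: str) -> None:
        """Store the summary for retrieval and the full text for synthesis."""
        self._summaries[doc_id] = summary
        self._full_docs[doc_id] = full_text

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        """Rank by overlap with the summaries; return the matching full texts."""
        q = tokens(query)
        ranked = sorted(
            self._summaries,
            key=lambda doc_id: len(q & tokens(self._summaries[doc_id])),
            reverse=True,
        )
        return [self._full_docs[doc_id] for doc_id in ranked[:k]]
```

The same structure applies when the indexed units are small chunks and the returned unit is the parent document each chunk came from.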
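# Schematic only: langchain.indexing and DocumentEmbedding are illustrative
# names, not a confirmed LangChain API. document is assumed to be defined.
from langchain.indexing import DocumentEmbedding
# Decouple document embedding for retrieval and answer synthesis
embedding = DocumentEmbedding()
retrieval_embedding = embedding.embed_for_retrieval(document)
synthesis_embedding = embedding.embed_for_synthesis(document)
Post-Processing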
Re-ranking
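One common way to re-rank for lower redundancy is maximal marginal relevance (MMR), which trades relevance to the query against similarity to documents already selected. MMR is named here as one possible technique, not necessarily the one the original snippet has in mind. The sketch assumes documents and the query are already embedded as plain lists of floats; `cosine` and `mmr_rerank` are hypothetical helpers.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(
    query_vec: list[float],
    doc_vecs: list[list[float]],
    k: int,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Return up to k document indices chosen greedily by MMR.

    lambda_mult = 1.0 ranks purely by relevance; lower values penalize
    documents that resemble ones already selected.
    """
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a near-duplicate pair among the candidates, lowering `lambda_mult` pushes the duplicate below a less relevant but more distinct document.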
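# Schematic only: langchain.postprocessing and ReRanking are illustrative
# names, not a confirmed LangChain API. retrieved_documents is assumed to be defined.
from langchain.postprocessing import ReRanking
# Re-rank retrieved documents to reduce redundancy
reranker = ReRanking()
reranked_documents = reranker.rerank_documents(retrieved_documents)
Classification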
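Classifying retrieved documents lets later steps treat each group differently, for example by sending code and prose to different prompts. The sketch below groups documents by a label returned from any classifier callable. The keyword rule is a stand-in for an LLM or trained classifier, and `group_by_label` and `keyword_classifier` are hypothetical names.

```python
from collections import defaultdict
from typing import Callable


def group_by_label(
    documents: list[str],
    classify: Callable[[str], str],
) -> dict[str, list[str]]:
    """Bucket documents under the label the classifier assigns to each one."""
    groups: dict[str, list[str]] = defaultdict(list)
    for doc in documents:
        groups[classify(doc)].append(doc)
    return dict(groups)


def keyword_classifier(doc: str) -> str:
    """Hypothetical stand-in for an LLM or trained classifier."""
    text = doc.lower()
    if "select " in text or "def " in text:
        return "code"
    return "prose"
```

Each bucket can then be handed to whichever prompt or downstream chain suits that content type.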
# Schematic only: langchain.postprocessing and DocumentClassification are
# illustrative names, not a confirmed LangChain API. retrieved_documents is assumed to be defined.
from langchain.postprocessing import DocumentClassification
# Classify retrieved documents for logical routing
classifier = DocumentClassification()
classified_documents = classifier.classify_documents(retrieved_documents)
This deconstruction covers the main components of retrieval augmented generation in LLM applications: query transformations, routing, query construction, indexing, and post-processing. Taken one technique at a time, with a snippet for each, it gives developers a starting point for implementing RAG in their own projects.






