
LANGCHAIN — Is Langchain Predibase the Easiest Way to Fine-Tune and Productionize OSS LLMS?
LangChain Predibase: The Easiest Way to Fine-Tune and Productionize OSS LLMs
LangChain recently integrated with Predibase, a developer platform for open-source (OSS) LLMs, so that LangChain developers can use hosted OSS models in their workflows. With the integration, builders can deploy and query pre-trained or custom open-source LLMs and fine-tune their own LLMs in a few lines of code, among other features.
Predibase is built on top of open-source foundations including Ludwig and Horovod, and supports multiple interfaces including a UI and a Python SDK, making it accessible to users of all levels. In addition, the platform can be deployed on Predibase-managed infrastructure or securely inside your own Cloud VPC.
To get started with LangChain and Predibase, follow these steps:
- Sign up for Predibase for free at https://predibase.com/free-trial.
- Create an account.
- Go to Settings > My profile and generate a new API token.
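Once the token exists, one way to keep it out of source code is to export it in your shell and read it at start-up. The sketch below assumes the environment variable name `PREDIBASE_API_TOKEN` used in the example code in this post; the helper `get_predibase_token` is a hypothetical name for illustration and is not part of the Predibase SDK or LangChain.

```python
import os


def get_predibase_token(env_var: str = "PREDIBASE_API_TOKEN") -> str:
    """Return the API token from the environment, or fail with a clear message.

    Hypothetical helper for illustration; not part of Predibase or LangChain.
    """
    token = os.environ.get(env_var, "").strip()
    # Treat an unfilled "{...}" placeholder the same as a missing value.
    if not token or token.startswith("{"):
        raise RuntimeError(
            f"{env_var} is not set. Generate a token under Settings > My profile "
            "and export it before running the example."
        )
    return token
```

Calling `get_predibase_token()` before building the chain surfaces a missing or placeholder token immediately, rather than as an authentication error on the first model call.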
Below is an example of a simple Q/A system you can build using LangChain and Predibase-hosted LLMs.
pip install predibase
pip install langchain
pip install chromadb
pip install sentence_transformers
# Replace with your Predibase API Token
import os
os.environ["PREDIBASE_API_TOKEN"] = "{PREDIBASE_API_TOKEN}"
from langchain.document_loaders import WebBaseLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.llms import Predibase
# Document loader
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()
# Split into Chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
# Store Embeddings in Chroma
from langchain.vectorstores import Chroma
from langchain.embeddings import HuggingFaceEmbeddings
vectorstore = Chroma.from_documents(documents=all_splits, embedding=HuggingFaceEmbeddings())
# Pull in any LLM from Predibase, including fine-tuned LLMs
llm = Predibase(model="llama-2-13b", predibase_api_key=os.environ.get("PREDIBASE_API_TOKEN"))
# Fetch relevant chunks into LLM
from langchain.chains import RetrievalQA
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
# Example question about the loaded page; replace with your own
question = "What is task decomposition?"
qa_chain({"query": question})
This example shows how to build a simple Q/A system with LangChain and Predibase: load a web page, split it into chunks, store the chunk embeddings in Chroma, and answer questions with a Predibase-hosted LLM through a retrieval chain. The same steps can serve as a starting point for your own applications and workflows.
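To make the retrieval step less of a black box, here is a dependency-free sketch of the same idea: split text into chunks, score each chunk against the question, and place the best matches into the prompt. It is an illustration under loud assumptions, not how Chroma, HuggingFaceEmbeddings, or RetrievalQA are implemented. Word counts stand in for neural embeddings, the splitter ignores separators, and every function name and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> list[str]:
    """Cut text into fixed-size character windows (simplified: no separator logic)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real system uses a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Place the retrieved chunks into a single prompt (wording is illustrative)."""
    context = "\n\n".join(context_chunks)
    return (
        "Use the context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    document = (
        "Task decomposition breaks a large task into smaller steps. "
        "Memory lets an agent recall earlier information. "
        "Tool use lets an agent call external APIs."
    )
    question = "What is task decomposition?"
    chunks = split_text(document, chunk_size=60)
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

In the real pipeline, the vector store performs the similarity search over learned embeddings and the chain sends the assembled prompt to the Predibase-hosted model; the sketch only shows the shape of the data flow.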
We’re excited to make these capabilities available to all LangChain users and look forward to seeing what the community builds. If you have ideas, comments, or questions, feel free to reach out on the LangChain Discord or via [email protected].
