Use Your Locally Stored Files To Get Response From GPT like ChatGPT

Summary

The article provides a technical guide on using locally stored text files to generate responses from GPT-3, similar to ChatGPT, utilizing Python and libraries such as OpenAI, Langchain, and Chroma.

Abstract

The article by Shweta Lodha, titled "Use Your Locally Stored Files To Get Response From GPT like ChatGPT | Python," demonstrates a method for interacting with text files stored on a local machine to obtain responses akin to those from ChatGPT. It leverages the OpenAI API, specifically GPT-3, in conjunction with the Langchain and Chroma Python libraries. The process involves importing necessary packages, setting up the OpenAI API key, loading text files from a local directory, splitting the data into manageable chunks, converting the text into vector embeddings, and finally creating and validating a model to respond to queries. The author emphasizes the use of a DirectoryLoader to handle various file types and a CharacterTextSplitter for dividing long texts. The article also includes a step-by-step guide on how to install the required packages, set up the environment, and test the model with a sample question about the effects of homelessness.

Opinions

The author suggests that using locally stored text files for generating GPT-3 responses can be an effective way to interact with data.
They imply that the Langchain and Chroma libraries are essential tools for working with GPT-3 in Python.
The article promotes the idea that splitting large datasets into smaller chunks is beneficial for processing with GPT-3.
The author encourages readers to validate the model to ensure that the responses are derived from the provided data.
They recommend watching a video recording provided in the article for a comprehensive understanding of the process.
The author advocates for the use of environment variables to securely handle API keys.

Import Required Packages

As we are using Python, let’s go ahead and import the required packages.

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain import OpenAI, VectorDBQA
from langchain.document_loaders import DirectoryLoader
import magic
import os
import nltk

If these packages are not installed on your machine, install them before importing.

pip install langchain
pip install openai
pip install chromadb
pip install unstructured
pip install beautifulsoup4
pip install python-magic-bin

After installing, run the following line in Python to download the NLTK tagger data:

nltk.download('averaged_perceptron_tagger')

Once the required packages are imported, we need an OpenAI API key.
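
The summary above notes that the key is read from an environment variable. Here is a minimal sketch of that pattern, assuming the variable is named OPENAI_API_KEY (the name used in the embeddings call later in this article). The helper read_api_key is hypothetical and exists only for illustration.

```python
import os


def read_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable.

    read_api_key is a hypothetical helper, not part of the openai or
    langchain packages. Failing early gives a clearer message than a
    KeyError raised later, when the key is first used.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key
```

Keeping the key in the environment, rather than in the script, means the source file can be shared without exposing the key.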

Load Input Data

In order to load our text files, we need to instantiate DirectoryLoader and that can be done as shown below:

loader = DirectoryLoader('Store', glob='**/*.txt')
docs = loader.load()

In the above code, the glob pattern restricts the loader to text files, including those in subdirectories of Store. This is particularly useful when your input directory contains a mix of different file types.
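
Before the embeddings are created, the loaded documents are split into smaller chunks. The article does this with LangChain's CharacterTextSplitter, imported above, and the doc_texts name used in the next section appears to hold the resulting chunks. The sketch below is a plain-Python illustration of the idea, not the library's implementation. The function name split_text and its defaults are assumptions made for this example.

```python
def split_text(text, chunk_size=1000, separator="\n\n"):
    """Pack separator-delimited pieces into chunks of at most chunk_size characters.

    Illustrative only: a single piece longer than chunk_size is kept
    whole here, and no overlap between chunks is produced.
    """
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # Adding this piece would overflow the chunk, so close it.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Smaller chunks keep each embedding request within the model's input limits and let the retrieval step return only the passages relevant to a question.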

Create Vector Store

Next, we need to create embeddings of our data, which means turning the text into vectors. Let’s do this by instantiating an OpenAIEmbeddings object as shown below:

openAI_embeddings = OpenAIEmbeddings(openai_api_key=os.environ['OPENAI_API_KEY'])
# doc_texts holds the chunks produced by the split step
vStore = Chroma.from_documents(doc_texts, openAI_embeddings)
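
To give a feel for what a vector store does with these embeddings, here is a toy sketch of similarity search over a handful of made-up vectors. It is not how Chroma is implemented, and Chroma's index and distance metric may differ. The names cosine_similarity and nearest_chunks, and the two-dimensional vectors, are assumptions chosen to keep the example small.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_chunks(query_vec, store, k=2):
    """Return the k chunk texts whose vectors are most similar to query_vec.

    store is a list of (chunk_text, vector) pairs; a real vector store
    would use an index instead of sorting every entry.
    """
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up two-dimensional "embeddings" for three chunks.
toy_store = [("chunk a", [1.0, 0.0]), ("chunk b", [0.0, 1.0]), ("chunk c", [1.0, 1.0])]
closest = nearest_chunks([1.0, 0.1], toy_store, k=2)
```

When a question is asked, its embedding plays the role of query_vec, and the closest chunks are what the model is given as context.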

Validate Model

We have created our model and received a response. But how can we be sure that this response comes only from our data? To get this assurance, we need to validate our model. You can find the validation lines in my video mentioned below.

If anything is unclear, I recommend watching my video recording, which demonstrates this flow from end to end.

Use Your Locally Stored Files To Get Response From GPT like ChatGPT | Python


