Using text-generation-webui to download and run an LLM [part 6 of a series]
In the previous post in this series, we looked at using text-generation-webui to chat with the downloaded LLM model.
To use a different LLM, first navigate to Hugging Face in your browser and find a model to download.
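A good starting point is the high-quality Mixtral-8x7B LLM. It is a mixture-of-experts model with roughly 47 billion parameters in total, so it is small only relative to the largest models: https://huggingface.co/mistralai/Mixtral-8x7B-v0.1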
On the Hugging Face model page, there is a model name like username/model.
Select the little copy icon next to the name, to copy it to the clipboard.
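If you only have the page URL to hand, the model name can also be derived from it. The following is a minimal Python sketch, not part of text-generation-webui: the function names are hypothetical, and the slash-to-underscore rule is an assumption based on how the example model appears in the model dropdown (mistralai_Mixtral-8x7B-v0.1).

```python
from urllib.parse import urlparse


def model_id_from_url(url: str) -> str:
    """Return the 'username/model' part of a Hugging Face model page URL."""
    path_parts = [part for part in urlparse(url).path.split("/") if part]
    if len(path_parts) < 2:
        raise ValueError(f"No username/model found in URL: {url}")
    # The first two path segments are the account name and the model name.
    return "/".join(path_parts[:2])


def local_folder_name(model_id: str) -> str:
    """Assumed naming rule: the slash is replaced by an underscore."""
    return model_id.replace("/", "_")


if __name__ == "__main__":
    url = "https://huggingface.co/mistralai/Mixtral-8x7B-v0.1"
    model_id = model_id_from_url(url)
    print(model_id)                     # the name to paste into the webui
    print(local_folder_name(model_id))  # the name expected in the dropdown
```

The first printed value is what goes into the download box; the second is the name to look for once the download has finished.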
In the gradio user interface (which is provided by your install of text-generation-webui):
1. Click on the Model tab at the top of the page.
2. Paste the model name into the input box, in the section Download model or LoRA.
3. Select Download
A progress bar should appear, showing the progress of the download.
4. At top-left, select the little blue refresh icon (circled)
5. Select the dropdown. Select the model, in this case mistralai_Mixtral-8x7B-v0.1
6. Select Load. Wait for the model to load.
7. Select the Parameters tab at the top.
8. Set auto_max_new_tokens to be on. This is to prevent truncation.
9. Adjust temperature to suit (0 = very conservative and repetitive, 1 = more random and creative).
10. Select the Chat tab at the top-left.
11. Type in a prompt for the AI, for example:
how will AI affect the work life of a software developer?
The AI should generate a text response!
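To get a feel for what the temperature setting in step 9 does, here is a minimal Python sketch of temperature-scaled softmax, the usual way temperature is applied when sampling the next token. It is an illustration only, not the code text-generation-webui runs: the function name and the scores are made up, and a temperature of exactly 0 is excluded because it would divide by zero (tools typically treat it as "always pick the most likely token").

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
    for temperature in (0.2, 0.7, 1.0):
        probs = softmax_with_temperature(logits, temperature)
        print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate almost all of the probability on the highest-scoring token, which gives conservative and repetitive output. Values closer to 1 leave more probability for the alternatives, which gives more varied output.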
So now we have our own LLM instance running in the Amazon cloud.
Congratulations! You have completed the series of articles.
Hopefully this has been of some use to anyone interested in hosting their own LLMs and exploring the possibilities. Good luck in your future learning!
The full series of articles “Using text-generation-webui to chat with the downloaded LLM model”: