Sean Ryan

Installing text-generation-webui to host LLMs for chat or for training [part 3 of a series]

In the previous post in this series, we looked at how to create an EC2 instance on AWS cloud with a GPU card.

The open source project text-generation-webui [https://github.com/oobabooga/text-generation-webui] provides an easy-to-use interface for downloading, hosting and using LLMs for inference (text prediction/generation). It can also be used for training, although that is not its main function.

A basic user interface to chat with an LLM-powered AI. The interface is provided by the open source project text-generation-webui and includes interfaces for loading models and training.

Reference: https://repost.aws/articles/ARQ0Tz9eorSL6EAus7XPMG-Q/how-to-install-textgen-webui-on-aws

Installing and running text-generation-webui at launch, via a custom Ubuntu service.

First, you need to connect to the EC2 instance via SSH.

By default, an EC2 instance does not have a static IP address. You can of course assign a static IP address, but that has a running cost.

If you only occasionally need to connect to the EC2 instance, see this post, which provides a workaround.

Now that you are connected to the EC2 instance, you can run a script to install the text-generation-webui.

note: if something goes wrong with the install, delete the installer_files folder; otherwise, subsequent re-installations will not work.
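The cleanup step above can be sketched as a single command (assuming the repository was cloned to the default location, ~/text-generation-webui; adjust the path if you installed elsewhere):

```shell
# Remove the installer state so the next install starts clean.
# Assumes the default clone location; change the path if yours differs.
rm -rf "$HOME/text-generation-webui/installer_files"
```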

One-time setup scripts:

  • One-step simple install of text-generation-webui with GPU support. Also downloads a small LLM (Mistral 7B).
curl https://gist.githubusercontent.com/mrseanryan/70b09d405e77d0881d01ca288d2476cc/raw | bash

The source code can be seen here:

https://gist.github.com/mrseanryan/70b09d405e77d0881d01ca288d2476cc

Alternative approach:

  • Step-by-step manual install of text-generation-webui with GPU support

https://gist.github.com/mrseanryan/1f6ee66fde867ac37e9c57da2d7a3f09

In the SSH console output, you should see the text-generation-webui HTTP server start up.

It will output a local link like http://127.0.0.1:7860 and also a public Gradio share URL, like Running on public URL: https://91234123123123.gradio.live

Hold the CTRL key and left-click the public URL.

The text-generation-webui web page should load in your default browser.

tip: use an SSH tunnel for a simple localhost URL

To use a simple localhost URL in your browser, like http://localhost:7860, you can open an SSH tunnel by running this command:

ssh -i path-to-my.pem -N -L 7860:localhost:7860 ubuntu@<public IP address of EC2 instance>

Now, the local link should work from your browser: http://localhost:7860/
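To confirm the tunnel is working without opening a browser, you can probe the forwarded port from a second terminal (a sketch; it assumes the tunnel from the previous step is running and the web UI is up):

```shell
# Print the HTTP status code returned through the tunnel.
# A 200 means the web UI is reachable on the forwarded port.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:7860/
```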

tip: check the Ubuntu firewall

If you have difficulty connecting, check whether the Ubuntu firewall is active:

sudo ufw status

If the firewall is active, you can configure the Ubuntu firewall to open the necessary ports:

sudo ufw allow 5000/tcp
sudo ufw allow 7860/tcp

For more details about the Ubuntu firewall, see https://www.cyberciti.biz/faq/how-to-configure-firewall-with-ufw-on-ubuntu-20-04-lts/#Open_ports_with_ufw

tip: checking disk space

The LLMs and the Python libraries can consume a lot of disk space.

To check disk space from the Ubuntu terminal:

df -h

To see what folders are taking up the most space:

sudo du -cha --max-depth=1 / | grep -E "M|G"
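Since downloaded models are usually the largest consumers of space, it can also help to check the models folder directly (a sketch, assuming the default location ~/text-generation-webui/models; adjust if yours differs):

```shell
# Show the total size of the downloaded models folder.
# Assumes the default models directory; change the path if needed.
du -sh "$HOME/text-generation-webui/models"
```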

