Installing text-generation-webui to host LLMs for chat or for training [part 3 of a series]
In the previous post in this series, we looked at how to create an EC2 instance on AWS cloud with a GPU card.
The open source project text-generation-webui [https://github.com/oobabooga/text-generation-webui] provides an easy-to-use interface for downloading, hosting, and using LLMs for inference (text prediction/generation). It can also be used for training, although that is not its main function.
A basic user interface to chat with an LLM-powered AI. The interface is provided by the open source project text-generation-webui and includes interfaces for loading models and training.
ref = https://repost.aws/articles/ARQ0Tz9eorSL6EAus7XPMG-Q/how-to-install-textgen-webui-on-aws
Installing text-generation-webui, and running it at launch via a custom Ubuntu service.
First, you need to connect to the EC2 instance via SSH.
By default, an EC2 instance does not have a static IP address. You can of course assign one, but that incurs a running cost.
If you only occasionally need to connect to the EC2 instance, see this post that provides a workaround.
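A typical connect command looks like the following sketch — the key path and IP address here are hypothetical placeholders; substitute your own .pem file and the instance's current public IP from the EC2 console:

```shell
# Hypothetical values - replace with your own key file and the
# instance's current public IP (shown in the EC2 console):
KEY_PATH="path-to-my.pem"
EC2_IP="203.0.113.10"

# The default user on Ubuntu AMIs is "ubuntu":
CONNECT_CMD="ssh -i $KEY_PATH ubuntu@$EC2_IP"
echo "$CONNECT_CMD"
```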
Now that you are connected to the EC2 instance, you can run a script to install the text-generation-webui.
note: if something goes wrong with the install, you need to delete the installer_files folder — otherwise, subsequent re-installations will not work.
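The cleanup amounts to removing that one folder before retrying. A sketch — the demo below uses a temporary directory so it is safe to run anywhere; in practice, point INSTALL_DIR at the folder where text-generation-webui was installed:

```shell
# Demo with a temporary directory standing in for the install folder
# (in practice, use the real install location, e.g. ~/text-generation-webui):
INSTALL_DIR="$(mktemp -d)"
mkdir -p "$INSTALL_DIR/installer_files"

# A failed install leaves this folder behind; remove it before retrying:
rm -rf "$INSTALL_DIR/installer_files"

[ ! -d "$INSTALL_DIR/installer_files" ] && echo "clean - safe to re-run the installer"
```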
One-time setup scripts:
- One-step simple install of text-generation-webui with GPU support. Also downloads a small LLM (Mistral 7B).
curl https://gist.githubusercontent.com/mrseanryan/70b09d405e77d0881d01ca288d2476cc/raw | bash
The source code can be seen here:
https://gist.github.com/mrseanryan/70b09d405e77d0881d01ca288d2476cc
Alternative approach:
- Step-by-step manual Install of text-generation-webui with GPU support
https://gist.github.com/mrseanryan/1f6ee66fde867ac37e9c57da2d7a3f09
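To have the server start automatically at boot, as the section title mentions, the startup script can be wrapped in a systemd unit. This is a minimal sketch under assumptions — the unit name, user, and paths are hypothetical and assume the one-click install lives at /home/ubuntu/text-generation-webui with its start_linux.sh launcher:

```ini
# /etc/systemd/system/text-generation-webui.service
# (hypothetical unit name and paths - adjust User, WorkingDirectory
#  and ExecStart to match your install)
[Unit]
Description=text-generation-webui server
After=network.target

[Service]
Type=simple
User=ubuntu
WorkingDirectory=/home/ubuntu/text-generation-webui
ExecStart=/home/ubuntu/text-generation-webui/start_linux.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After saving the file, register and start it with: sudo systemctl daemon-reload && sudo systemctl enable --now text-generation-webui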
In the SSH console output, you should see the text-generation-webui HTTP server start up.
It will output a local link, like http://127.0.0.1:7860 and also a public share URL, like: Running on public URL: https://91234123123123.gradio.live
Hold the CTRL key and left-click on the public URL.
The text-generation-webui web page should load in your default browser.
tip: use an SSH tunnel for a simple localhost URL
To use a simple localhost URL in your browser, like http://localhost:7860, you can create an SSH tunnel by running this command:
ssh -i path-to-my.pem -N -L 7860:localhost:7860 ubuntu@<public IP address of EC2 instance>
Now, the local link should work from your browser: http://localhost:7860/
tip: check the Ubuntu firewall
If you have difficulty connecting, check whether the Ubuntu firewall is active:
sudo ufw status
If the firewall is active, you can configure it to open the necessary ports:
sudo ufw allow 5000/tcp
sudo ufw allow 7860/tcp
For more details about the Ubuntu firewall, see https://www.cyberciti.biz/faq/how-to-configure-firewall-with-ufw-on-ubuntu-20-04-lts/#Open_ports_with_ufw
tip: checking disk space
The LLMs and the Python libraries can consume a lot of disk space.
To check disk space from the Ubuntu terminal:
df -h
To see what folders are taking up the most space:
sudo du -cha --max-depth=1 / | grep -E "M|G"
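Most of that space usually sits in the downloaded models. The same du pattern, sorted largest-first, shows which model folders to consider deleting — the demo below uses a temporary directory with dummy files so it is safe to run; in practice, point it at the models folder of your install (e.g. ~/text-generation-webui/models):

```shell
# Demo on a temporary directory standing in for the models folder
# (in practice, use the real path, e.g. ~/text-generation-webui/models):
DEMO_DIR="$(mktemp -d)"
mkdir -p "$DEMO_DIR/model-a" "$DEMO_DIR/model-b"
dd if=/dev/zero of="$DEMO_DIR/model-a/weights.bin" bs=1024 count=2048 2>/dev/null
dd if=/dev/zero of="$DEMO_DIR/model-b/weights.bin" bs=1024 count=512 2>/dev/null

# Largest folders first - the same pattern works on the real models directory:
du -sh "$DEMO_DIR"/* | sort -rh
```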