Youssef Hosni


What is LLMOps and How to Get Started With It

LLMOps is primarily focused on enhancing operational capabilities and establishing the necessary infrastructure for refining existing foundational models and seamlessly integrating these optimized models into products.

Although LLMOps may not seem groundbreaking to most observers within the MLOps community, it serves as a specialized subset within the broader MLOps domain. A more specific definition can elucidate the intricate requirements involved in fine-tuning and deploying these models effectively.

Foundational models, such as GPT-3 with its massive 175 billion parameters, demand substantial amounts of data and compute resources for training. While fine-tuning these models may not require the same scale of data or computational power, it remains a significant task that necessitates robust infrastructure capable of parallel processing and handling large datasets.

This article delves into essential resources to help initiate your journey into LLMOps, providing valuable insights and guidance for getting started effectively.

LLMOps consists of two parts:

1. Large Language Models

  • LLM-as-a-Service is where a vendor offers the LLM as an API on their infrastructure. This is primarily how closed-source models are delivered.
  • Custom LLM stack is a broader category of tools necessary for fine-tuning and deploying proprietary solutions built on top of open-source models.

2. Prompt Engineering tools: enable in-context learning instead of fine-tuning at lower costs and without using sensitive data.

  • Vector Databases retrieve contextually relevant information for certain prompts.
  • Prompt Execution enables optimizing and improving the model output, from managing prompt templates to building chain-like sequences of relevant prompts.
  • Prompt Logging, Testing, and Analytics … Let’s just say it’s an emerging space that has no categories yet.
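To make the vector-database idea above concrete, here is a minimal sketch of retrieving contextually relevant documents for a prompt by cosine similarity. The three-dimensional embeddings are made up for illustration; a real system would use an embedding model and approximate nearest-neighbor search.

```python
import numpy as np

# A toy in-memory "vector database": each document stored with a
# pre-computed embedding vector (hand-made here for illustration).
documents = {
    "LLMOps covers deployment of large language models.": np.array([0.9, 0.1, 0.0]),
    "Vector databases store embeddings for retrieval.":   np.array([0.1, 0.9, 0.1]),
    "Prompt templates structure model inputs.":           np.array([0.0, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_embedding, k=1):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_embedding, documents[doc]),
        reverse=True,
    )
    return ranked[:k]

# A query embedded near the "retrieval" document's vector.
print(retrieve(np.array([0.2, 0.8, 0.1])))
```

The retrieved text would then be prepended to the prompt, giving the model context without any fine-tuning.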

Table of Contents:

  1. Building LLM Applications for Production
  2. Awesome LLMOps
  3. LLMOps
  4. Automated Testing for LLMOps

Most insights I share in Medium have previously been shared in my weekly newsletter, To Data & Beyond.

If you want to be up-to-date with the frenetic world of AI while also feeling inspired to take action or, at the very least, to be well-prepared for the future ahead of us, this is for you.

🏝Subscribe below🏝 to become an AI leader among your peers and receive content not present in any other platform, including Medium:

1. Building LLM Applications for Production

It’s easy to build LLMs, but very hard to make something production-ready with them. This article by Chip Huyen covers how to put LLMs into production.

This post consists of three parts:

  • Part 1 discusses the key challenges of productionizing LLM applications and the solutions the author has seen.
  • Part 2 discusses how to compose multiple tasks with control flows (e.g. if statement, for loop) and incorporate tools (e.g. SQL executor, bash, web browsers, third-party APIs) for more complex and powerful applications.
  • Part 3 covers some of the promising use cases the author has seen companies building on top of LLMs and how to construct them from smaller tasks.
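The task composition in Part 2 can be sketched in a few lines: an LLM call classifies the request, and a control-flow branch routes it either to a tool (here a SQL executor) or back to the model. `call_llm` and `run_sql` are stubs so the example runs offline; real code would call a model API and a database.

```python
def call_llm(prompt: str) -> str:
    # Stub standing in for a real model call: either classifies the
    # question or "answers" it, so the routing logic below is runnable.
    if prompt.startswith("Classify"):
        return "sql" if "average" in prompt else "chat"
    return f"(model answer to: {prompt})"

def run_sql(query: str) -> str:
    # Stand-in for a real SQL executor tool.
    return "42.0"

def answer(question: str) -> str:
    # Control flow: route the question to a tool or back to the model.
    intent = call_llm(f"Classify this question as 'sql' or 'chat': {question}")
    if intent == "sql":
        sql = call_llm(f"Write SQL for: {question}")
        return run_sql(sql)
    return call_llm(question)

print(answer("What is the average order value?"))  # routed to the SQL tool
```

The same pattern generalizes to loops (retry until an output validates) and to other tools such as a shell, a web browser, or third-party APIs.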

2. Awesome LLMOps

This GitHub repo contains a curated list of the best LLMOps resources and tools for developers. Here are the categories it covers:

  1. Model
  2. Serving
  3. Security
  4. LLMOps
  5. Search
  6. Code AI
  7. Training
  8. Data
  9. Large Scale Deployment
  10. Performance
  11. AutoML
  12. Optimizations
  13. Federated ML
  14. Awesome Lists

3. LLMOps

In this course, you’ll go through the LLMOps pipeline of pre-processing training data for supervised instruction tuning, and adapt a supervised tuning pipeline to train and deploy a custom LLM.

This is useful for creating an LLM workflow for your specific application. For example, in this course you’ll create a question-answering chatbot tailored to Python coding questions.

Through the course, you’ll go through key steps of creating the LLMOps pipeline:

  • Retrieve and transform training data for supervised fine-tuning of an LLM.
  • Version your data and tuned models to track your tuning experiments.
  • Configure an open-source supervised tuning pipeline and then execute that pipeline to train and then deploy a tuned LLM.
  • Output and study safety scores to responsibly monitor and filter your LLM application’s behavior.
  • Try out the tuned and deployed LLM yourself in the classroom!
  • Tools you’ll practice with include BigQuery data warehouse, the open-source Kubeflow Pipelines, and Google Cloud.
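The steps above can be sketched as a linear pipeline. The function names and bodies here are hypothetical stubs for illustration; the course itself builds these stages with BigQuery and open-source Kubeflow Pipelines components.

```python
def prepare_data(raw_examples):
    """Transform raw Q&A pairs into an instruction-tuning format."""
    return [{"input_text": q, "output_text": a} for q, a in raw_examples]

def version(artifact, tag):
    """Record which data version a tuning run used (stub)."""
    return {"tag": tag, "artifact": artifact}

def tune_and_deploy(dataset):
    """Stand-in for launching the supervised tuning pipeline and
    deploying the resulting model behind an endpoint."""
    return {"endpoint": "tuned-model", "train_size": len(dataset)}

raw = [("How do I reverse a list in Python?", "Use my_list[::-1] or reversed(my_list).")]
dataset = prepare_data(raw)
run = version(dataset, tag="v1")
model = tune_and_deploy(run["artifact"])
print(model["endpoint"], model["train_size"])
```

Versioning both the data and the tuned model, as in the second step, is what makes tuning experiments reproducible and comparable.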

4. Automated Testing for LLMOps

In this course, you will learn how to create a continuous integration (CI) workflow to evaluate your LLM applications at every change for faster, safer, and more efficient application development.

When building applications with generative AI, model behavior is less predictable than with traditional software. That’s why systematic testing can make an even bigger difference in saving you development time and cost.

Continuous integration, a key part of LLMOps, is the practice of making small changes to software in development and thoroughly testing them to catch issues early. With a robust automated testing pipeline, you’ll be able to isolate bugs before they accumulate, when they’re easier and less costly to fix. Automated testing lets your team focus on building new features so that you can iterate and ship products faster.

After completing this course, you will be able to:

  • Write robust LLM evaluations to cover common problems like hallucinations, data drift, and harmful or offensive output.
  • Build a continuous integration (CI) workflow to automatically evaluate every change to your application.
  • Orchestrate your CI workflow to run specific evaluations at different stages of development.
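An evaluation of the kind described above can be as simple as an assertion you run on every change. This is a hedged sketch: `model` is a stub returning a safe fallback, and both checks are deliberately rough; in a real CI workflow these would be pytest tests calling your actual application.

```python
def model(question: str) -> str:
    # Stub standing in for your LLM application.
    return "I don't know."

BANNED_PHRASES = ["guaranteed returns", "medical diagnosis"]

def eval_no_banned_phrases(question: str) -> bool:
    """Fail if the answer contains phrasing we never want to ship."""
    answer = model(question)
    return not any(p in answer.lower() for p in BANNED_PHRASES)

def eval_hallucination_guard(question: str, known_facts: set) -> bool:
    """Rough check: the answer must cite a known fact or abstain."""
    answer = model(question)
    return "don't know" in answer or any(f in answer for f in known_facts)

# In CI, these run automatically on every change to the application:
assert eval_no_banned_phrases("Is this investment safe?")
assert eval_hallucination_guard("Who founded the company?", {"Ada Lovelace"})
print("all evaluations passed")
```

Orchestration then means running cheap checks like these on every commit and reserving slower, model-graded evaluations for later stages such as pre-release.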

If you like the article and would like to support me, make sure to:

Subscribe to my newsletter To Data & Beyond to get full and early access to my articles:

Are you looking to start a career in data science and AI and do not know how? I offer data science mentoring sessions and long-term career mentoring:
