Mohit Verma

Summary

This article is the first part of a series on MLOps, focusing on setting up Kubeflow Pipeline V2, MLflow, and Seldon Core for local system deployment and basic MLOps workflow understanding.

Abstract

The initial segment of a comprehensive four-part MLOps series, this article introduces the fundamental concepts of MLOps and guides readers through the installation process of Kubeflow Pipeline V2, MLflow, and Seldon Core on a local system. The author, a telecom cloud engineer with a passion for learning new technologies, aims to elucidate the basics of MLOps without delving into full production-grade deployment practices. The tutorial employs PyTorch for model creation and utilizes a CNN model architecture inspired by CNN Explainer's tinyVGG. Although the dataset is small and the results are not optimal, the focus is on understanding the MLOps pipeline, which includes training, testing, and deploying models. The article also provides insights into the functionalities of Kubeflow, MLflow, and Seldon Core, emphasizing their roles in streamlining the machine learning lifecycle, managing model development, deployment, and serving ML models as production REST/GRPC microservices.

Opinions

  • The author is not an MLOps engineer but approaches the topic with enthusiasm and a hobbyist's curiosity, indicating that the tutorial is tailored for beginners or those new to MLOps.
  • The article suggests that MLflow is a unified platform suitable for both individual researchers and large teams, highlighting its utility in managing the complexities of model development, deployment, and management.
  • The use of Kubeflow Pipeline V2 is recommended for its ability to facilitate the orchestration of Kubernetes ML workloads, ensuring scalability and portability.
  • Seldon Core is presented as a versatile tool for converting ML models into production REST/GRPC microservices, simplifying the deployment process.
  • The author emphasizes the importance of creating a base Docker image to reduce image size, thereby improving the efficiency of the pipeline runtime.
  • The tutorial advocates for the use of Containerized Python Components within KFP pipelines, showcasing their effectiveness as building blocks for creating reproducible and scalable ML workflows.

MLOps with Kubeflow Pipelines V2, MLflow, Seldon Core: Part 1

This is the first part of a four-part MLOps series.

Part 1: Introduction to the basic concepts and installation on local system.

Part 2: Understanding the Kubeflow pipeline and components.

Part 3: Understanding the MLflow server UI for logging parameters, code versions, metrics, and output files.

Part 4: Deploying the model with the Seldon Core server on Kubernetes.

MLOps Pipeline overview
Overview of Kubeflow pipeline successful run

MLOps is a trending topic among machine learning engineers and data scientists. At its core, it is the practice of setting up a workflow for training and testing a model and making it available in production.
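To make the train/test/deploy workflow concrete before any tooling is involved, here is a framework-free sketch in plain Python. All function names and the threshold are illustrative only; the real pipeline in this series uses PyTorch, Kubeflow, and Seldon Core for these stages.

```python
# Conceptual sketch of the three MLOps stages: train -> evaluate -> deploy.
# Everything here is a toy stand-in for the real pipeline components.

def train(data):
    """Pretend training: 'learn' the mean of the training data."""
    return {"mean": sum(data) / len(data)}

def evaluate(model, test_data):
    """Pretend evaluation: mean absolute error against the learned mean."""
    return sum(abs(x - model["mean"]) for x in test_data) / len(test_data)

def deploy(model, error, threshold=1.0):
    """Gate deployment on the evaluation metric."""
    return model if error <= threshold else None

model = train([1.0, 2.0, 3.0])
error = evaluate(model, [2.0, 2.5])
served = deploy(model, error)
```

The point is only the shape of the workflow: each stage consumes the previous stage's output, and deployment is gated on a metric. The rest of this series replaces each toy function with a real pipeline component.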

I am not an MLOps engineer but a telecom cloud engineer, and learning new tech is my hobby. This tutorial explains only the basics of MLOps and may not reflect full production-grade deployment practices. In this example I have created and trained a model using the PyTorch framework, with a CNN architecture copied from CNN Explainer's tinyVGG. The results of this training and inference are not great because the dataset used for training is very small.

PyTorch: PyTorch is an optimized deep-learning tensor library based on Python and Torch, used mainly for applications running on GPUs and CPUs. It is often favored over other deep-learning frameworks such as TensorFlow and Keras because it uses dynamic computation graphs and is fully Pythonic.

For the 25-hour PyTorch tutorial by freeCodeCamp on YouTube, click here. I highly recommend that course to beginners like me. Some basic knowledge of machine learning is needed for this tutorial.

Kubeflow: Kubeflow is a community and ecosystem of open-source projects addressing each stage of the machine learning (ML) lifecycle. It makes ML on Kubernetes simple, portable, and scalable. The goal of Kubeflow is to facilitate the orchestration of Kubernetes ML workloads and to empower users to deploy best-in-class open-source tools on any cloud infrastructure. In this tutorial, Kubeflow Pipelines v2.2, one part of the Kubeflow ecosystem, will be used.

MLflow: Whether you’re an individual researcher, a member of a large team, or somewhere in between, MLflow provides a unified platform to navigate the intricate maze of model development, deployment, and management. MLflow aims to enable innovation in ML solution development by streamlining otherwise cumbersome logging, organization, and lineage concerns that are unique to model development. This focus allows you to ensure that your ML projects are robust, transparent, and ready for real-world challenges.
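The "logging and lineage" idea can be pictured without MLflow installed: each run records parameters, metrics (with history), and artifact locations. The `Run` class below is a minimal illustrative stand-in, not MLflow's API; real code would call `mlflow.log_param` / `mlflow.log_metric` inside `mlflow.start_run()`, which Part 3 covers.

```python
# Toy stand-in for MLflow-style run tracking: parameters are single
# values, metrics keep per-step history, artifacts are stored paths.

class Run:
    def __init__(self, run_id):
        self.run_id = run_id
        self.params = {}
        self.metrics = {}
        self.artifacts = []

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        # Metrics accumulate a history, one entry per logging step.
        self.metrics.setdefault(key, []).append(value)

    def log_artifact(self, path):
        self.artifacts.append(path)

run = Run("run-001")
run.log_param("learning_rate", 0.001)
for loss in [0.9, 0.5, 0.3]:
    run.log_metric("train_loss", loss)
run.log_artifact("s3://modeloutput/model.pth")
```

The artifact path mirrors the MinIO bucket (`modeloutput`) configured later in this article as MLflow's artifact root.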

Seldon: Seldon Core converts your ML models (TensorFlow, PyTorch, H2O, etc.) or language wrappers (Python, Java, etc.) into production REST/gRPC microservices.
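Once a model is deployed (covered in Part 4), querying it over REST amounts to POSTing a JSON payload in Seldon's v1 data format to a predictable URL. The host, namespace, and deployment names below are placeholders for illustration; the `{"data": {"ndarray": ...}}` shape is Seldon Core's v1 prediction protocol.

```python
import json

def prediction_request(host, namespace, deployment, instances):
    """Build the URL and JSON body for a Seldon Core v1 REST prediction call."""
    url = f"http://{host}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    payload = {"data": {"ndarray": instances}}
    return url, json.dumps(payload)

# Hypothetical deployment name and port-forwarded host, for illustration only.
url, body = prediction_request("localhost:8003", "kubeflow", "cnn-model", [[0.1, 0.2]])
```

The actual HTTP call (e.g. with `requests.post(url, data=body, headers={"Content-Type": "application/json"})`) is deferred until a SeldonDeployment exists in the cluster.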

Setting up the local environment.

  1. Create a Kubernetes cluster with kind.
# Create the kind cluster 

➜  ~ kind version
kind v0.21.0 go1.21.6 darwin/arm64

➜  ~ kind create cluster --name kubeflow

➜  ~ kubectl version
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.1

2. Install the Kubeflow components and access the Kubeflow UI and MinIO UI.

➜  ~ kubectl apply -k 'github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic?timeout=120&ref=2.2.0'

➜  ~ kubectl get pods -n kubeflow
NAME                                                       READY   STATUS      RESTARTS      AGE
cache-deployer-deployment-cf9646b9c-jt5n5                  1/1     Running     0             44d
cache-server-56d4959c9-m9h8p                               1/1     Running     0             44d
metadata-envoy-deployment-9c7db86d8-9kkc8                  1/1     Running     0             44d
metadata-grpc-deployment-d94cc8676-n96ds                   1/1     Running     1 (44d ago)   44d
metadata-writer-cd5dd8f7-qnm2q                             1/1     Running     3 (22h ago)   44d
minio-5dc6ff5b96-2c8g8                                     1/1     Running     0             44d
ml-pipeline-64d6db5897-jr5jw                               1/1     Running     0             44d
ml-pipeline-persistenceagent-77947c888d-7drjn              1/1     Running     0             44d
ml-pipeline-scheduledworkflow-676478b778-6t96t             1/1     Running     0             44d
ml-pipeline-ui-87b9d4fb6-6hts2                             1/1     Running     2 (22h ago)   44d
ml-pipeline-viewer-crd-8574556b89-82t9d                    1/1     Running     0             44d
ml-pipeline-visualizationserver-5d7c54f495-zzqhr           1/1     Running     0             44d
my-mlflow-6689ff755d-5b7fp                                 1/1     Running     0             5h49m
mysql-5b446b5744-gtpt6                                     1/1     Running     0             44d
workflow-controller-66d557786-sxkll                        1/1     Running     1 (22h ago)   35

➜  ~ kubectl port-forward svc/ml-pipeline-ui 8002:80 -n kubeflow
Kubeflow pipeline v2.2 UI

=> Access the MinIO S3 storage UI.

➜  ~ kubectl port-forward svc/minio-service 9000:8081 -n kubeflow
MinIO S3 storage installed with kubeflow

3. Install the MLflow server with Helm.

➜  ~ helm repo add community-charts https://community-charts.github.io/helm-charts

# Edit the MinIO S3 credentials and bucket in values.yaml.

extraEnvVars:
  AWS_ACCESS_KEY_ID: 
  AWS_SECRET_ACCESS_KEY: 
artifactRoot:
  s3:
    # -- Specifies if you want to use AWS S3 Mlflow Artifact Root
    enabled: true
    # -- S3 bucket name
    bucket: "modeloutput" # required

➜  ~ helm install my-mlflow community-charts/mlflow --version 0.7.19 -f mlflow-values.yaml -n kubeflow

➜  ~ kubectl get pods -n kubeflow | grep mlflow
my-mlflow-6689ff755d-5b7fp                                 1/1     Running     0             5h57m

➜  ~ kubectl port-forward svc/my-mlflow 8004:9000

=> Access the MLflow server UI.

mlflow UI default view

4. Install Seldon Core on the Kubernetes cluster using Helm.

➜  ~ kubectl create namespace seldon-system

➜  ~ helm install seldon-core seldon-core-operator --repo https://storage.googleapis.com/seldon-charts --set usageMetrics.enabled=true --set istio.enabled=false --namespace seldon-system

➜  ~ kubectl get pods -n seldon-system
NAME                                         READY   STATUS    RESTARTS      AGE
seldon-controller-manager-65f8dbf9bc-h9fdb   1/1     Running   1 (23h ago)   3d8h

5. Create a conda virtual environment and install the dependencies.

python                    3.10.13
kfp                       2.7.0 
kfp-kubernetes            1.2.0 
kfp-pipeline-spec         0.3.0 
kfp-server-api            2.0.5 
typing-extensions         4.9.0    
typing_extensions         4.9.0    
jupyter                   1.0.0

6. Create an image to use as the base image for building the components of the KFP pipeline. KFP components are explained later.

This base image is created to keep the component images small, so that downloads and installs during pipeline runtime are quicker.

➜  ~ cat requirements.txt
torch
kfp-kubernetes
pathlib
boto3
mlflow
requests
pillow
numpy
typing

➜  ~ cat Dockerfile
FROM python:3.10-slim
RUN apt-get update \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cpu

➜  ~ docker image build -t mohitverma1688/model_train_component:v0.1 .
➜  ~ docker image ls 
mohitverma1688/model_train_component       v0.1        81bbc346d378   4 weeks ago    782MB

Creating KFP component directory structure and building component Docker image.

Components are the building blocks of KFP pipelines. A component is a remote function definition; it specifies inputs, has user-defined logic in its body, and can create outputs. When the component template is instantiated with input parameters, we call it a task. For this pipeline I have used “Containerized Python Components”.
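The component-versus-task distinction can be sketched in plain Python: a component is a reusable function template, and instantiating it with concrete inputs yields a task. The decorator below is a toy analogy only; KFP's real `@dsl.component` builds a container spec rather than running the function inline.

```python
# Toy analogy for KFP's component/task model: decorating a function
# produces a reusable template; calling it with inputs produces a "task"
# record that captures the instantiation.

def component(func):
    def instantiate(**inputs):
        return {
            "name": func.__name__,   # which component template was used
            "inputs": inputs,        # the concrete input parameters
            "output": func(**inputs) # the result of running the logic
        }
    return instantiate

@component
def model_train(num_epochs: int, batch_size: int) -> str:
    return f"trained for {num_epochs} epochs, batch size {batch_size}"

# Instantiating the component template with inputs yields a task.
task = model_train(num_epochs=10, batch_size=32)
```

In a real pipeline, the "output" would be artifacts written to storage and the task would run in its own container, but the template/instance relationship is the same.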

src
└── components
    ├── data_download
    │   └── data_download_component.py
    ├── model_eval
    │   ├── model_eval_component.py
    │   └── utils.py
    ├── model_inference
    │   ├── data_setup.py
    │   ├── engine.py
    │   ├── model_builder.py
    │   ├── model_inference.py
    │   ├── model_inference_component.py
    │   └── utils.py
    ├── model_train_cnn
    │   ├── data_setup.py
    │   ├── engine.py
    │   ├── model_builder.py
    │   ├── model_train.py
    │   ├── model_train_component.py
    │   └── utils.py
    └── register_model
        ├── model_builder.py
        └── register_model_component.py

In a Containerized Python Component, base_image specifies the base image that KFP will use when building your new container image. Specifically, KFP uses the base_image argument for the FROM instruction in the Dockerfile used to build your image.

Now that the code is in a standalone directory as above, we can conveniently build an image using the kfp component build CLI command for each component in the components directory. See the model_train example below.

%%writefile src/components/model_train_cnn/model_train_component.py
from kfp import dsl
from kfp import compiler
from typing import Dict
from kfp.dsl import Dataset,Output,Artifact,OutputPath,InputPath, Model,HTML

@dsl.component(base_image='mohitverma1688/model_train_component:v0.1',
               target_image='mohitverma1688/model_train_component:v0.24',
               packages_to_install=['pandas','matplotlib']
               )

def model_train(num_epochs:int, 
                batch_size:int, 
                hidden_units:int,
                learning_rate: float,
                train_dir: str,
                test_dir: str,
                model_name: str,
....
Note: For simplicity, only a snippet of the code is shown.

As you can see, I have used the previously created Docker image as the base_image. Running the kfp component build command will then produce additional artifacts, mainly the Dockerfile and runtime-requirements.txt. You can set target_image to push the built Docker image directly to your registry.

!kfp component build src/components/model_train_cnn --component-filepattern model_train_component.py 

src/components/model_train_cnn
├── Dockerfile
├── component_metadata
│   └── model_train.yaml
├── data_setup.py
├── engine.py
├── kfp_config.ini
├── model_builder.py
├── model_train.py
├── model_train_component.py
├── runtime-requirements.txt
└── utils.py

➜  ~ cat Dockerfile
# Generated by KFP.

FROM mohitverma1688/model_train_component:v0.2

WORKDIR /usr/local/src/kfp/components
COPY runtime-requirements.txt runtime-requirements.txt
RUN pip install --no-cache-dir -r runtime-requirements.txt

RUN pip install --no-cache-dir kfp==2.7.0
COPY . .

➜  ~ cat runtime-requirements.txt
# Generated by KFP.
matplotlib
pandas

In the next part I will explain each component of the pipeline in detail :)
