The article discusses the integration of MLFlow and BentoML to streamline the machine learning lifecycle, enabling rapid deployment of the best-performing models.
Abstract
The article outlines an efficient approach to machine learning operations (MLOps) by combining MLFlow and BentoML. It emphasizes the challenge of prolonged model deployment times, with 64% of organizations taking over a month to deploy models, and proposes a solution to reduce this time significantly. MLFlow is utilized for experimentation and model selection, while BentoML facilitates model serving and deployment. The integration of these tools allows for a historical view of the training and deployment process, ensuring reproducibility and scalability of machine learning models from development to production. The article guides readers through setting up MLFlow and BentoML, conducting hyperparameter tuning with MLFlow, and deploying the selected model using BentoML, culminating in a practical example of predicting house prices.
Opinions
The author believes that reducing the time to deploy machine learning models empowers data scientists to focus on feature development, leading to more accurate models.
MLFlow is praised for its ease of use in experiment tracking and model comparison, requiring minimal additional code.
BentoML is highlighted for its ability to containerize models and manage deployable artifacts, simplifying the inference process and encouraging reproducible models across environments.
The author values the visualization of the training and deployment process provided by the BentoML registry, which enhances the understanding of the machine learning lifecycle.
The article suggests that the combination of MLFlow and BentoML can make data teams more effective by enabling faster deployment of machine learning models.
Deploying the best model in a few minutes with BentoML and MLFlow
Simplifying your MLOps stack with MLFlow and BentoML
Developing machine learning models takes time. Algorithmia’s 2021 Enterprise Trends in Machine Learning report found that 64% of organizations took at least a month to deploy a model.¹ Cutting that time by at least half would free data scientists to build cleaner, better features for their models. Better features lead to more accurate models.
Yet figuring out which tools to use for deployment is a problem in itself. There are many tools that each do different things, and the space is quite messy. Fortunately, we can install MLFlow and BentoML to cover the machine learning operations (MLOps) tasks.
MLFlow is great for experimenting with different models and training setups, for example hyperparameter tuning, which helps identify the best model. Adding BentoML enables model serving and deployment in production by:
Importing the best model from MLFlow registry
Building an API service with BentoML
Creating Bento and containerizing for deployment
Adding BentoML to the MLFlow pipeline gives you a historical view of your training and deployment process. While the model trains, MLFlow saves the runs and their parameters in MLFlow’s registry, which is designed for model comparison and selection in an experimentation workflow. BentoML saves this training context in the BentoML registry for future reference. The BentoML registry manages deployable artifacts (Bentos) and simplifies the model inference process. MLFlow models run natively on a BentoML runner, so you can take advantage of BentoML features such as input validation, adaptive batching, and parallelism. The BentoML registry also encourages reproducible models across development, testing, and production environments, and it lets us visualize the training and deployment process across the machine learning lifecycle.
In this article, you will learn how to:
Use MLFlow to run experiments and find the best model
Use BentoML to deploy the best model in a few seconds
Setup
MLflow is an open-source framework for managing the end-to-end machine learning lifecycle by logging parameters, code versions, metrics, and output files. The best part? Logging takes one extra line of code, which leaves data scientists free to focus on getting an accurate model.
To install the library with the specific version I used:
pip install mlflow==1.26.1
To start a local MLFlow UI for testing:
mlflow ui
BentoML is an open-source Python framework that allows us to quickly deploy and serve machine learning models at scale.
pip install bentoml==1.0.0
To understand how BentoML and MLFlow work, we will train a model that predicts house prices based on their characteristics.
Fortunately for us, most of the data is clean, but we still need to filter out null values when prediction requests come in. During training, we called df.dropna(). Because the data must be preprocessed the same way for training and for prediction, we need to put this step in a callable function.
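A minimal sketch of such a callable, assuming the features arrive as a pandas DataFrame. The function name and the column names in the sample frame are made up for illustration and are not taken from the article's code.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with missing values, the same step applied before training."""
    return df.dropna().reset_index(drop=True)


# Hypothetical request data in which the second row is incomplete.
raw = pd.DataFrame({"bedrooms": [3, None, 4], "bathrooms": [2.0, 1.0, 2.5]})
clean = preprocess(raw)
print(len(raw), "rows in,", len(clean), "rows out")
```

Keeping the step in one function means the training script and the prediction service can both call it, so the two cannot drift apart.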
There are two ways in which we could drop null values. The first would be adding the df.dropna() in the actual service endpoint. We could easily do that in this instance since df.dropna() is not heavyweight.
However, the other way allows me to scale the transformation and the model separately. This is useful for heavyweight transformations, and all I need to do is add another Runnable with the @bentoml.Runnable.method decorator.
Hyperparameter tuning with MLFlow
After setting up the MLFlow UI, we can train the model and tune its hyperparameters.
To record each experiment, all we have to add is one line to our code before running any experiments:
mlflow.sklearn.autolog()
Every MLFlow run will now automatically record data and save it in the local metric repository. The UI can read this data from the repository.
We will now predict the housing prices with these features:
the number of bedrooms
the number of bathrooms
the area of the living room and lot
the number of floors
a waterfront view
an index from 0 to 4 of how good the view of the property was
condition of the house
level of construction and design (grade)
the area of the interior housing space that is above ground level and below ground level
the year it was built and renovated
location (zipcode, latitude, longitude)
the square footage of interior housing living space and lot for the nearest 15 neighbors
We tune the hyperparameters to get the best model, the one with the lowest squared_error.
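One way such a search could be set up is a cross-validated grid search scored on mean squared error. This is only a sketch: the article does not say which estimator or grid was used, so the RandomForestRegressor, the parameter grid, and the synthetic data below are all assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the house features and prices.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# Hypothetical grid; every combination is evaluated with 3-fold cross-validation.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, 6]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # scikit-learn maximizes, hence the sign
    cv=3,
)
search.fit(X, y)

print("best params:", search.best_params_)
print("best cross-validated MSE:", -search.best_score_)
```

If mlflow.sklearn.autolog() has been called beforehand, fitting the search is what gets recorded as a run.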
After all the experiments have run, the MLFlow UI pulls the metrics from the local repository and organizes them neatly at http://127.0.0.1:5000/. This is what you will see when you open the UI:
Image by author
The UI lists the details recorded for each run.
Image by author
Now that the models are trained, we can use the id of the best run to save the best model to the BentoML repository.
Predictions can be requested either from a file or from data sent in the request.
Sending in a file path is convenient for testing. The decorator @service.api declares that the function predict is an API whose input is a file_path string and whose output is a JSON with the predictions. The File class makes sure that the input, the file_path, is actually a string.
In the Swagger UI, we can see an example value.
Image by Author
Now I submit the sample value data/test/X_test.csv, which returns a list of predictions.
Image by Author
However, sending in a file path is not practical in production. In this version, the decorator @service.api declares that the function predict is an API whose input is a list of data and whose output is a JSON with the predictions.
In the Swagger UI, let’s send the following example to the API.
In this article, I used MLFlow to find the best model for predicting house prices. This model was then deployed with BentoML in mere seconds. Together, BentoML and MLFlow help deploy machine learning models faster and make the data team more effective.