Automating Machine Learning Using FLAML
Using FLAML to Automate the Machine Learning Process
Machine Learning is a process in which we try to solve real-life business problems using a variety of algorithms. Creating a Machine Learning model is easy, but selecting the model that performs best on our data in terms of generalization and performance is a difficult task.
There is a wide variety of Machine Learning algorithms for both Regression and Classification. We can narrow them down based on the kind of problem we are trying to solve, but comparing the candidates takes considerable computation, time, and effort. Several Python libraries automate the process of selecting the best Machine Learning model efficiently; one such library is FLAML.
FLAML is a lightweight open-source Python library that finds the best Machine Learning model automatically, efficiently, and economically. It is fast, which saves time, and it is lightweight in design.
In this article, we will explore FLAML and its functionalities.
Let’s get started…
Installing required libraries
We will start by installing FLAML with pip, using the command given below. In recent FLAML releases the AutoML dependencies are an optional extra, so you may need pip install "flaml[automl]" instead.
pip install flaml
Importing required libraries
In this step, we will import the AutoML class from FLAML, which is what we need for creating a Machine Learning model. The dataset loaders are imported later, in the sections that use them.
from flaml import AutoML
Solving Classification Problem
Now we will start by solving a classification problem. The data that we will be using here is the famous Iris dataset, which can be easily loaded from the scikit-learn library. Let’s start creating the model.
# Loading the dataset
from sklearn.datasets import load_iris
Next, we need to create an AutoML instance and define its settings, so we will do both in this step.
automl = AutoML()
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'accuracy',
    "task": 'classification'
}
Next, we will load the data and fit the model on it. Finally, we will predict using the model and print the best model found.
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict class probabilities
print(automl.predict_proba(X_train).shape)
# Export the best model
print(automl.model)
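Here, the output shows that the Extra Trees estimator is the best model for this data in this run; the winner can change between runs because the search is limited by the time budget. Now let us print the best learner, its hyperparameters, and its accuracy.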
print('Best ML learner:', automl.best_estimator)
print('Best hyperparameter config:', automl.best_config)
print('Best accuracy on validation data: {0:.4g}'.format(1-automl.best_loss))
print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))
We will follow the same process for the Regression problem.
Solving Regression Problem
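Now we will solve a regression problem. The data that we will be using here is the famous Boston dataset, which is loaded from the scikit-learn library. Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so this code requires an older scikit-learn version. We can follow the exact same process as we did for the Classification problem.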
from sklearn.datasets import load_boston
automl = AutoML()
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'r2',
    "task": 'regression'
}
X_train, y_train = load_boston(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict(X_train).shape)
# Export the best model
print(automl.model)
print('Best ML learner:', automl.best_estimator)
print('Best hyperparameter config:', automl.best_config)
print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))
print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))
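The `1 - automl.best_loss` expression in these prints works because FLAML minimizes a loss, and for the `accuracy` and `r2` metrics that loss is one minus the score. Here is a small pure-Python sketch of that arithmetic on made-up numbers. The helper functions `accuracy` and `r2` are hypothetical illustrations, not FLAML functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up labels and predictions, for illustration only.
acc_loss = 1 - accuracy([0, 1, 2, 1], [0, 1, 1, 1])
r2_loss = 1 - r2([3.0, 5.0, 7.0, 9.0], [2.5, 5.5, 7.0, 9.0])

# Subtracting the loss from 1 gives back the original score.
print("accuracy recovered from loss:", 1 - acc_loss)
print("r2 recovered from loss: {0:.4g}".format(1 - r2_loss))
```

For other metrics the loss is not necessarily one minus the score, so this conversion should be checked against the FLAML documentation before it is reused.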
Here we can see the best model and hyperparameters for the Regression problem as well.
Similarly, you can follow this process for your dataset and find the best models and hyperparameters for your problem. Try this with different datasets, and let me know your comments in the response section.
This article is in collaboration with Piyush Ingale.
Before You Go
Thanks for reading! If you want to get in touch with me, feel free to reach me at [email protected] or through my LinkedIn profile. You can view my GitHub profile for different data science projects and package tutorials. Also, feel free to explore my profile and read the other articles I have written on Data Science.






