Aditya Bhattacharya

Summary

This article discusses the concept of Deep Hybrid Learning, a fusion of conventional Machine Learning with state-of-the-art Deep Learning, and its application in solving an image classification problem using the CheXpert dataset.

Abstract

The article begins by highlighting the importance of Deep Learning in unstructured data analysis and its ability to automatically extract features from data. It then introduces the concept of Deep Hybrid Learning, which combines Deep Learning and Machine Learning to leverage the benefits of both approaches. The author uses a Deep Hybrid Network (DHN) model to solve an image classification problem for the CheXpert dataset, comparing its performance to a standalone DNN model. The DHN model uses a DNN layer for feature extraction and a Machine Learning classification layer, with experiments conducted using AdaBoost, XGBoost, and Random Forest algorithms. The results show that the DHN-AdaBoost variant performed the best, with better AUC metrics and model training time than the standalone DNN model. The article concludes by encouraging readers to try implementing the DHN model themselves and provides a link to the code walkthrough.

Bullet points

  • Deep Learning plays a vital role in unstructured data analysis by automatically extracting features from data.
  • Deep Hybrid Learning combines Deep Learning and Machine Learning to leverage the benefits of both approaches.
  • The article uses a Deep Hybrid Network (DHN) model to solve an image classification problem for the CheXpert dataset.
  • The DHN model uses a DNN layer for feature extraction and a Machine Learning classification layer.
  • Experiments were conducted using AdaBoost, XGBoost, and Random Forest algorithms.
  • The DHN-AdaBoost variant performed the best, with better AUC metrics and model training time than the standalone DNN model.
  • The article encourages readers to try implementing the DHN model themselves and provides a link to the code walkthrough.

Deep Hybrid Learning: a fusion of conventional ML with state-of-the-art DL

A fusion approach to combine Machine Learning with Deep Learning

Image source: Pixabay

Among state-of-the-art methods for unstructured data analysis, Deep Learning plays a vital role: its sophisticated algorithms and model architectures automatically extract features from unstructured data and provide more realistic solutions to real-world problems. Before we begin the discussion, note that a recording of the session on Deep Hybrid Learning that I presented at the GIDS AI/ML 2020 conference is also available if you would like an overview.

One of the main reasons Deep Learning gained popularity is that it removes the need for manual feature engineering on unstructured data, which is extremely difficult and which almost all classical machine learning algorithms depend on. In classical machine learning, the final performance and accuracy of the algorithm depend heavily on how well we understand the dataset and how well we engineer features from it. On the other hand, the final classification or clustering layer of a Deep Learning model, driven by fully connected neural network layers, may overfit when trained on little data, and these models often consume more computational power and resources than classical machine learning algorithms need.

So, what if we take the benefits of both approaches and fuse them together?

This can be achieved with Deep Hybrid Learning, the fusion network that results from combining Deep Learning and Machine Learning. In this article, we will learn how to use Deep Hybrid Learning: Deep Learning methods generate or extract features from unstructured data, and classical Machine Learning approaches use those features to build highly accurate classification models. Thus, with Deep Hybrid Learning (DHL) we can take the benefits of both DL and ML, alleviate the drawbacks of each, and provide more accurate and less computationally expensive solutions.

We will apply DHL to an image classification problem on the CheXpert dataset, made publicly available by the Stanford University ML group. Briefly, the project was about automating radiographic image analysis to diagnose various pathologies. In other words, I was trying to classify different disease conditions from chest X-ray images. The code walkthrough is available on my personal website.

Chest X-ray image from the CheXpert dataset

With this approach, we will primarily check two things:

1. Compare the accuracy of DHL with that of a standalone DNN or ML model

2. Compare the computational time of DHL with that of the DNN model
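The two checks above can be sketched as a tiny measurement harness. This is only an illustration using the Python standard library: the helper names `timed` and `accuracy`, and the `train_dummy` placeholder, are hypothetical and are not taken from the code walkthrough.

```python
import time
from typing import Any, Callable, Sequence, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical placeholder for a real training routine (not the article's model).
def train_dummy(n_steps: int) -> int:
    total = 0
    for i in range(n_steps):
        total += i
    return total


result, seconds = timed(train_dummy, 100_000)
print(f"training took {seconds:.4f} s")

# 3 of the 4 predictions match the labels.
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
print(f"accuracy: {acc:.2f}")
```

Wrapping each model's training call in the same `timed` helper keeps the time comparison fair, as long as both runs happen on the same machine with the same data.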

But first, let’s discuss the model architecture of a simple DHL, which is as follows:

Simple Deep Hybrid Network Model Architecture (Image By Author)

The figure above shows a simple Deep Hybrid Network model architecture. Here, the DNN part has only 4 layers and is followed by the ML classification layer, for which I experimented with ML classification algorithms such as AdaBoost, XGBoost and Random Forest. For more complicated problems and better model performance, I would recommend a more complex DNN, but for this article our purpose is not to build a super accurate model but to study the improvement of DHL over ML and DL.
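To make the two-stage idea concrete, here is a minimal sketch of a hybrid pipeline using scikit-learn. It rests on several assumptions and is not the model from the code walkthrough: synthetic tabular data stands in for the chest X-ray images, `MLPClassifier` with two hidden layers stands in for the DNN, and the helper `extract_features` is a hypothetical name. The point is only the structure: train a network, read out its last hidden layer as features, and fit a classical classifier on those features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Assumption: synthetic data stands in for image data (this is not CheXpert).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Stage 1: train a small network end to end (a stand-in for the DNN).
dnn = MLPClassifier(
    hidden_layer_sizes=(32, 16), activation="relu", max_iter=500, random_state=0
)
dnn.fit(X_train, y_train)


def extract_features(model: MLPClassifier, data: np.ndarray) -> np.ndarray:
    """Forward pass through the hidden layers only, skipping the output layer."""
    activations = data
    for weights, bias in zip(model.coefs_[:-1], model.intercepts_[:-1]):
        activations = np.maximum(activations @ weights + bias, 0.0)  # ReLU
    return activations


# Stage 2: fit a classical ML classifier on the learned features.
features_train = extract_features(dnn, X_train)
features_test = extract_features(dnn, X_test)
head = AdaBoostClassifier(n_estimators=50, random_state=0)
head.fit(features_train, y_train)

dnn_auc = roc_auc_score(y_test, dnn.predict_proba(X_test)[:, 1])
dhn_auc = roc_auc_score(y_test, head.predict_proba(features_test)[:, 1])
print(f"standalone network AUC: {dnn_auc:.3f}")
print(f"hybrid (network features + AdaBoost) AUC: {dhn_auc:.3f}")
```

On toy data like this, either model may come out ahead, so the printed numbers say nothing about the CheXpert results. For images, the same pattern would typically use a convolutional network and the activations of its last layer before the classification head.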

Now, the simple standalone DNN model architecture looks like this:

Simple DNN model architecture, which will be the base of the comparison with our DHN (image by author)

If we observe carefully, the DNN layers are actually the same in both cases. We train both models on roughly 10,000 images and check model performance on 2,000 images. The number of epochs used for training the model and generating the features is the same in both cases (60 epochs), and the other hyperparameter tuning is consistent as well. The confusion matrix and AUC score observed for each approach are as follows:

1. For Standalone DNN model

Model Evaluation Metrics for standalone DNN model (image by author)

The results obtained on the 2,000 test images, using the evaluation metrics of Accuracy, Precision, Recall, F1 Score, AUC Score and the confusion matrix, look really promising, but we will mainly compare the approaches on their AUC Score and Confusion Matrix results.
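For readers who want to see what these two comparison metrics measure, here is a small from-scratch sketch in plain Python for the binary case. The function names and the toy labels and scores are made up for illustration. In practice, `sklearn.metrics.roc_auc_score` and `sklearn.metrics.confusion_matrix` compute the same quantities.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> List[List[int]]:
    """Return [[TN, FP], [FN, TP]] for binary labels in {0, 1}."""
    matrix = [[0, 0], [0, 0]]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1
    return matrix


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise version is O(P * N), which is
    fine for a small illustration but slow for large test sets.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and predicted scores, invented for this example.
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

cm = confusion_matrix(y_true, y_pred)
auc = auc_score(y_true, scores)
print(cm)   # [[3, 0], [1, 2]]
print(auc)  # 8 of the 9 positive/negative pairs are ranked correctly
```

The confusion matrix depends on the 0.5 threshold chosen here, whereas AUC looks only at how the scores rank positives against negatives, which makes it convenient for comparing models whose score scales differ.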

The overall training for 60 epochs took almost 45 minutes on my system.

2. For DHN-AdaBoost model

Model Evaluation Metrics for DHN-AdaBoost model (image by author)

We do see better results here! There is an increase of about 6–7% in almost all the metrics, which is a significant improvement in performance.

Now, do you wonder how long the training process took on the same machine? It took around 33 minutes! So this approach is also faster, by roughly 12 minutes.

What if we try another variant of DHN?

3. For DHN-XGBoost Model

Model Evaluation Metrics for DHN-XGBoost model (image by author)

Again we see some interesting results. This variant performed slightly better than the standalone DNN model, but its training time was almost 48 minutes, which is more than what we saw for both DHN-AdaBoost and the standalone DNN model.

Now, what if we try a non-boosting technique like Random Forest as the final classification layer? Will the model perform better? Let’s find out!

4. For DHN-Random Forest Model

Model Evaluation Metrics for DHN-RF model (image by author)

Unfortunately, in this case the results are quite poor compared to the other variants. But the training time was the shortest of all! On the same system with the same configuration, the entire training process was over within 30 minutes!

So can we rule out this variant completely?

Well, maybe not, as there might be certain datasets on which this variant of DHN works well! If you want to take a peek at the code snippets, please visit my blog post.
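Since the best classification layer can depend on the dataset, it helps that swapping it is cheap: the features only need to be extracted once, and each candidate classifier can then be fitted on the same feature matrix. The sketch below illustrates this with scikit-learn under clearly stated assumptions: the features are synthetic placeholders rather than real DNN outputs, and `GradientBoostingClassifier` is used as a stand-in for XGBoost to avoid an extra dependency.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumption: placeholder features. In a DHN these would be the activations
# extracted from the trained DNN, computed once and reused for every head.
features, labels = make_classification(
    n_samples=500, n_features=16, n_informative=6, random_state=1
)
f_train, f_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=1, stratify=labels
)

# Candidate classification layers. GradientBoosting stands in for XGBoost here.
heads = {
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=1),
    "GradientBoosting": GradientBoostingClassifier(n_estimators=50, random_state=1),
    "RandomForest": RandomForestClassifier(n_estimators=50, random_state=1),
}

results = {}
for name, head in heads.items():
    start = time.perf_counter()
    head.fit(f_train, y_train)
    fit_seconds = time.perf_counter() - start
    auc = roc_auc_score(y_test, head.predict_proba(f_test)[:, 1])
    results[name] = {"auc": auc, "fit_seconds": fit_seconds}

for name, stats in results.items():
    print(f"{name:>16}: AUC={stats['auc']:.3f}, fit time={stats['fit_seconds']:.3f} s")
```

The ranking this prints on synthetic data is not meaningful in itself. The loop is only meant to show how a new dataset could be checked against several classification layers before ruling any of them out.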

Overall, we do see that a Deep Hybrid Network can perform better than the standalone DNN model with respect to both AUC metrics and model training time, with the DHN-AdaBoost variant performing the best. Are you intrigued by this concept and process? Are you interested in implementing it yourself and examining the results? If the answer to both questions is yes, drop me a note or comment and take a look at the code walkthrough for this article, so you can try it yourself and leverage the benefits of Deep Hybrid Learning!
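To put the training times quoted in this article side by side, the short snippet below computes each variant's change relative to the standalone DNN. The minute values are the approximate figures mentioned above ("almost 45", "around 33", "almost 48", "within 30"), so the percentages are rough as well.

```python
# Approximate training times in minutes, as quoted in this article
# (same machine, 60 epochs).
training_minutes = {
    "Standalone DNN": 45,
    "DHN-AdaBoost": 33,
    "DHN-XGBoost": 48,
    "DHN-Random Forest": 30,
}

baseline = training_minutes["Standalone DNN"]


def relative_change(minutes: float, reference: float) -> float:
    """Signed change relative to the reference, as a percentage."""
    return (minutes - reference) / reference * 100.0


changes = {
    name: relative_change(minutes, baseline)
    for name, minutes in training_minutes.items()
}

for name, pct in changes.items():
    print(f"{name:<18} {training_minutes[name]:>3} min  {pct:+.1f}% vs standalone DNN")
```

On these figures, DHN-AdaBoost trains in roughly a quarter less time than the standalone DNN, DHN-Random Forest in about a third less, and DHN-XGBoost takes a few percent longer.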

This brings us to the end of this article. On my personal website, I am trying to provide more details on Deep Hybrid Networks applied to other datasets and other uses. Hope I was able to help! Keep following: https://medium.com/@adib0073 and my website: https://www.aditya-bhattacharya.net/

** Update ** If you like this article and want to support my contributions to the community, please take a look at my book “Applied Machine Learning Explainability Techniques”. Its GitHub repository contains many hands-on tutorials on various chapters covered in the book: https://github.com/PacktPublishing/Applied-Machine-Learning-Explainability-Techniques. If you like the tutorials in the repository, please fork and star it to show your support for this project! You can also show your support by ordering a physical or electronic copy of the book.

Deep Hybrid Learning
Aditya Bhattacharya
Towards Data Science
Deep Learning
Radiographic Image Data