Sanchit Tanwar

Making our first CNN based project using Keras

Signup for my live computer vision course: https://bit.ly/cv_coursem

We will be making an image classifier using the Keras framework in this article. I have not written about CNNs myself, but I have mentioned two sources for learning about them. This tutorial is part of the deep learning workshop. The links to the lessons will be given below as soon as I update them. Github link of this repo is here. Link to the jupyter notebook of this tutorial is here.

Index

  1. Introduction to machine learning and deep learning.
  2. Introduction to neural networks.
  3. Introduction to Python.
  4. Building our first neural network in Keras.
  5. A comprehensive guide to CNN.
  6. Image classification with CNN <- You are here

We have already talked about how to structure deep learning projects and other basics. In this article, we will see how to use a convolutional neural network to build image-based deep learning models. Before jumping into the code, I want to clarify a few points. If these ideas are already clear to you, you can skip straight to the coding part.

Why convolutional neural networks?

You might be wondering why we need convolutional neural networks when ordinary artificial neural networks already work fine. The main part of computer vision is feature extraction. Let's make this point clearer with an example.

Take an image of a car: it is easy for us humans to look at it and tell that it is a car. A computer, on the other hand, sees every image as a matrix of numbers (a 1080p image has 1920 × 1080 × 3 = 6,220,800 of them), usually arranged in three channels for red, green and blue. The field of computer vision existed before deep learning, and feature extraction was done using hand-coded rules. These rules mostly consisted of predefined filters which we convolve with the image to get the desired result. For example, we can detect the edges in an image using a filter such as the Sobel filter, which looks like this:

Sobel operator

The Sobel operator has two separate kernels, one for horizontal and one for vertical edge detection. But edge detection alone is not a rich enough feature to tell a car from something else. What researchers did was make the filter (or kernel) a trainable parameter of the deep learning model. The kernel values are initialized randomly and learned during training as part of the model. With convolutional neural networks, we no longer need to extract the features of the image explicitly. Once the features are extracted, we can treat the rest as a normal machine learning problem and put a classifier on top. Mostly we just add a few dense layers after the CNN for classification, but we could even use an SVM or any other classifier.
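To make the idea of a hand-coded filter concrete, here is a minimal NumPy sketch of applying the two Sobel kernels to a tiny image. The 5x5 image and the helper name filter2d are made up for illustration; the helper slides the kernel over the image without flipping it, which is the same operation a CNN layer performs.

```python
import numpy as np

# Sobel kernels: SOBEL_X responds to left-to-right intensity changes
# (vertical edges), SOBEL_Y to top-to-bottom changes (horizontal edges).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def filter2d(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the kernel, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Hypothetical 5x5 grayscale image: dark on the left, bright on the right.
image = np.array([[0, 0, 0, 9, 9]] * 5, dtype=float)

gx = filter2d(image, SOBEL_X)  # non-zero where the window covers the edge
gy = filter2d(image, SOBEL_Y)  # all zeros: the image has no horizontal edge
print(gx)
print(gy)
```

In a CNN the nine numbers in each kernel are not fixed like this; they start out random and are adjusted by training.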

I hope this is clear; if not, feel free to ask in the comments. Also, refer to this article to visualize what a CNN sees, which will make the concept clearer.

What is image classification?

Image classification is a task of great importance with a lot of real-world applications. One of the best examples is self-driving cars. When we drive a car, we need to follow traffic rules. But how can a self-driving car know which sign it is looking at? This is where image classification comes into play. We can train a deep learning based model to predict the type of sign from its image, and the car can act accordingly.

So now let's start the fun part. You can also go to Google Colab to keep this interactive.

Dataset

We will be using the simple and famous cats and dogs dataset and will try to classify each image as a cat or a dog. The dataset is freely available on Kaggle. We will be using Google Colab for training the model. I will write a separate blog on how to get started with Google Colab, along with some other tips.

Cat image from the dataset

The dataset consists of around 25000 images in one folder, with filenames like ‘dog.6281.jpg’ and ‘cat.6361.jpg’. We have already dealt with data in Keras earlier, but unlike tabular data, image data is very large and we can’t load the complete dataset into RAM as we did for tabular data. So we need a method that picks a batch of images from the dataset and loads only that batch into memory.
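The batching idea can be sketched in a few lines of plain Python. This is only an illustration of the concept, not what Keras does internally: the function name batch_iterator and the filenames are made up, and a real loader would also read and decode the image files in each batch.

```python
def batch_iterator(items, batch_size):
    """Yield successive batches from a list, one batch at a time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical filenames standing in for the real dataset folder.
filenames = (['cat.%d.jpg' % i for i in range(5)] +
             ['dog.%d.jpg' % i for i in range(5)])

batches = list(batch_iterator(filenames, batch_size=4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Because the function is a generator, only the batch currently being processed needs to be held in memory. The last batch is smaller when the dataset size is not a multiple of the batch size.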

Dataloader

In Keras, we get two predefined methods for data loading: flow_from_directory and flow_from_dataframe (which takes a pandas dataframe). To use them, we either need to arrange our dataset into one subfolder per class, or create a dataframe containing the filenames and their classes. The second option is easier here, so we will use it.

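import os
import pandas as pd

# Every file in ./train is named like 'dog.6281.jpg' or 'cat.6361.jpg'
filenames = os.listdir('./train')
categories = []
for filename in filenames:
  # The label is the part of the filename before the first dot
  category = filename.split('.')[0]
  if category == 'dog':
    categories.append('dog')
  else:
    categories.append('cat')

df = pd.DataFrame({
    'filename': filenames,
    'category': categories
})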

This code creates a dataframe of filenames and their corresponding labels. Next we need a data generator that can flow through the dataframe. In the data generator, we will just normalize the images to the range 0 to 1. We will study image augmentation techniques later.

from keras.preprocessing.image import ImageDataGenerator
#We need to normalize image
datagen = ImageDataGenerator(rescale=1./255)

We will also create a validation loader to validate the model, so we will divide the dataframe into two parts: train and validation.

from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df,test_size = 0.2)

We are now ready to create the train and validation generators.

traingenerator = datagen.flow_from_dataframe(
    train_df, './train', x_col='filename', y_col='category',
    target_size=(224, 224), class_mode='categorical', batch_size=32)
testgenerator = datagen.flow_from_dataframe(
    test_df, './train', x_col='filename', y_col='category',
    target_size=(224, 224), class_mode='categorical', batch_size=32)

flow_from_dataframe returns a generator that yields one batch of data at a time. Its arguments are the dataframe we created, the path to the images, the names of the filename and label columns, the final image size and the batch size. We also set class_mode to 'categorical' so that the labels are one-hot encoded.

Data loading is different for every dataset, so practice with different datasets and study the Keras documentation. If you have any doubts, please feel free to contact me.

Now that we are done with the dataloader, we will create a model.

Model

The model will be a CNN-based model. We will keep it simple in this blog and study later how to improve the model and its accuracy.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Activation, BatchNormalization
model = Sequential()
model.add(Conv2D(32, (5,5), activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (5,5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (5,5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (5,5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (5,5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(2, activation='softmax'))  # 2 units because we have two classes: cat and dog
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

I have explained some parts of this already. Only Conv2D and MaxPooling2D are new here, so I will explain their arguments now.

In Conv2D, the first argument is the number of output channels. The second argument is the kernel size, for which we have used (5,5); you can use a different size, but we generally keep it odd. The reason for keeping the kernel size odd can be found here. We are already familiar with the third argument, which is the activation function of that layer.

In MaxPooling2D we give only one argument, the pool size.
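To see how these two arguments shape the network, here is a small sketch that traces the feature map size through the five Conv2D and MaxPooling2D pairs of the model above. It assumes the Keras defaults the model relies on: no padding and stride 1 for the convolutions, and a pooling stride equal to the pool size. The helper names conv_out and pool_out are made up for this example.

```python
def conv_out(size, kernel):
    """Output size of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool


size = 224      # input height and width
channels = 3    # RGB input
shapes = []
for filters in [32, 64, 128, 256, 256]:
    size = conv_out(size, 5)   # each 5x5 convolution trims 4 pixels
    size = pool_out(size, 2)   # each 2x2 pooling halves the size
    channels = filters         # Conv2D sets the number of output channels
    shapes.append((size, size, channels))

flattened = size * size * channels
print(shapes)     # [(110, 110, 32), (53, 53, 64), (24, 24, 128), (10, 10, 256), (3, 3, 256)]
print(flattened)  # 2304
```

These numbers should match the output shapes printed by model.summary(); the Flatten layer hands 2304 values to the first Dense layer.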

Now it's time to fit the model.

Training

Now that we have created the model, we are ready to train it. Training a model in Keras is very easy, just one line of code. Unlike the last blog, where the data was a single numpy array, here it is a data generator, and Keras provides a different function for this, fit_generator. (In recent Keras versions fit_generator is deprecated and model.fit accepts generators directly.)

history = model.fit_generator(
    traingenerator, 
    epochs=20,
    validation_data = testgenerator,
    validation_steps= len(testgenerator),
    steps_per_epoch = len(traingenerator)
)
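The values passed as steps_per_epoch and validation_steps are just the number of batches in each generator. As a rough check, here is the arithmetic, assuming exactly 25000 images (the article only says "around 25000"), the 80/20 split and the batch size of 32 used above.

```python
import math

total_images = 25000   # assumed exact count for this example
batch_size = 32
val_fraction = 0.2     # test_size used in train_test_split

val_images = round(total_images * val_fraction)
train_images = total_images - val_images

# len(generator) is the number of batches, rounding up for a partial last batch
steps_per_epoch = math.ceil(train_images / batch_size)
validation_steps = math.ceil(val_images / batch_size)

print(train_images, steps_per_epoch)   # 20000 625
print(val_images, validation_steps)    # 5000 157
```

So under these assumptions one epoch means 625 weight updates, followed by 157 validation batches.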

Training loss and accuracy plot

The code for plotting these graphs is the same as before.

import matplotlib.pyplot as plt

# Loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

# Accuracy (newer Keras versions name these keys 'accuracy' and 'val_accuracy')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()
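If the accuracy plot fails with a KeyError, the likely cause is the Keras version: older releases store the metric under 'acc', newer ones under 'accuracy'. A small helper can pick whichever key is present. The function name metric_key and the two example dictionaries below are made up for illustration.

```python
def metric_key(history_dict, candidates=('accuracy', 'acc')):
    """Return the first candidate key present in the history dictionary."""
    for key in candidates:
        if key in history_dict:
            return key
    raise KeyError('none of %r found in history' % (candidates,))


# Hypothetical history dictionaries from an older and a newer Keras version.
old_style = {'loss': [0.7, 0.5], 'acc': [0.55, 0.72]}
new_style = {'loss': [0.7, 0.5], 'accuracy': [0.55, 0.72]}

print(metric_key(old_style))  # acc
print(metric_key(new_style))  # accuracy
```

With a real training run you would call metric_key(history.history) and use 'val_' plus the returned key for the validation curve.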

Testing model

Once the model is trained, we can test it on some images taken from Google and see how it performs on real data.

import cv2
import numpy as np

def img_show(image):
  # OpenCV loads images as BGR; matplotlib expects RGB
  b,g,r = cv2.split(image)
  image = cv2.merge((r,g,b))
  plt.imshow(image)
  plt.show()
  return image

def test(model,image_path):
  img = cv2.imread(image_path)
  img = img_show(img)
  img = cv2.resize(img,(224,224))
  img = np.reshape(img,(1,224,224,3))  # add a batch dimension of size 1
  img = img/255.0                      # same normalization as in training
  prediction = model.predict(img)
  prediction = np.argmax(prediction)
  labels = (traingenerator.class_indices)
  labels = dict((v,k) for k,v in labels.items())  # invert: index -> label
  return labels[prediction]

Don't worry, this might look difficult but it is not. I will go through the code line by line. In the img_show function, I just convert the image from BGR to RGB: OpenCV reads images in BGR format while matplotlib expects RGB, so we need to convert before displaying. Next we resize the image to the shape we used during training. We can then predict on a custom image using the model.predict function, but it expects a batch of images rather than a single image, so we reshape our image from (224, 224, 3) to (1, 224, 224, 3), which means a batch size of 1. We also need to normalize the image the same way as in training, which is why we divide it by 255.0. The prediction is the softmax output: one confidence score per class, and the scores sum to 1.
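The same preprocessing steps can be tried on a tiny array without OpenCV or a trained model. The sketch below uses a made-up 2x2 image and a hand-written softmax purely to illustrate the channel swap, the batch dimension, the scaling and the sum-to-one property; it is not the code the model runs internally.

```python
import numpy as np

# Hypothetical 2x2 image in OpenCV's BGR order: every pixel is pure blue.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

rgb = bgr[..., ::-1]                 # reverse the channel axis: BGR -> RGB
batch = np.expand_dims(rgb, axis=0)  # (2, 2, 3) -> (1, 2, 2, 3)
batch = batch / 255.0                # scale pixel values to [0, 1]


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)


# Hypothetical raw scores for one image and two classes.
probs = softmax(np.array([[1.0, 3.0]]))

print(batch.shape)         # (1, 2, 2, 3)
print(rgb[0, 0].tolist())  # [0, 0, 255]
print(probs.sum())
```

Reversing the last axis does the same job as the cv2.split and cv2.merge calls in img_show, and np.expand_dims is an alternative to np.reshape that does not need the image size spelled out.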

np.argmax returns the index of the highest score. Now we need to convert this integer index back to its string label. Keras saves us here again: it internally creates a dictionary that maps string labels to integer labels, which we can inspect through the generator's class_indices attribute.

print(traingenerator.class_indices)
#printed output
{'cat': 0, 'dog': 1}

We just need to reverse this dictionary to get the string label from the predicted integer. The line before the return in the test function inverts the label dictionary.

labels = (traingenerator.class_indices)
labels = dict((v,k) for k,v in labels.items())
print(labels)
#Printed output
{0: 'cat', 1: 'dog'}

Finally, we just look up the integer predicted by the model in this dictionary to get its label.
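Putting the last two steps together, here is a minimal sketch of decoding one prediction in plain Python. The class_indices dictionary is the one printed above; the softmax scores are invented for illustration.

```python
class_indices = {'cat': 0, 'dog': 1}  # as printed by the generator above
labels = {index: name for name, index in class_indices.items()}

# Hypothetical softmax output for one image: [P(cat), P(dog)].
prediction = [[0.1, 0.9]]
scores = prediction[0]

# Index of the highest score, the same result np.argmax would give here
predicted_index = max(range(len(scores)), key=scores.__getitem__)

print(labels[predicted_index])  # dog
```

The dictionary comprehension is equivalent to the dict((v,k) ...) expression used in the test function.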

The model is working fine, and you can run the code on Google Colab to check it yourself.

In the next blog, we will learn how to improve our model and get better results. We will compare those results against this model, which will give us better insight.

Peace …
