This article explains the basics of multiclass image classification and how to perform image augmentation using TensorFlow and a Rock Paper Scissors dataset from Kaggle.
Abstract
This article provides an overview of multiclass image classification and image augmentation, a technique used to artificially expand the size of a training dataset by creating modified versions of images. It covers the installation of TensorFlow and uses a Rock Paper Scissors dataset from Kaggle to perform multiclass image classification. It also provides code snippets and explanations for defining a CNN model, model compilation and the callback function, generators, model fitting, visualizing model training, prediction, and label mapping. The article concludes with the full code being available on Kaggle and a recommendation to try out a cost-effective AI service.
Bullet points
Image augmentation is a technique used to artificially expand the size of a training dataset by creating modified versions of images
The article uses a Rock Paper Scissors dataset from Kaggle to perform multiclass image classification
The article covers the installation of TensorFlow and provides code snippets for defining a CNN model, model compilation and the callback function, generators, model fitting, visualizing model training, prediction, and label mapping
The article concludes with the full code being available on Kaggle and a recommendation to try out a cost-effective AI service.
Multiclass Classification with Image Augmentation
This article explains the basics of multiclass image classification and how to perform image augmentation.
Image augmentation is a solution to the problem of limited data. It artificially expands the size of a training dataset by creating modified versions of the images already in it. The term covers a suite of techniques that increase the size and quality of the training set so that better deep learning models can be built from it.
First, confirm that TensorFlow is installed and check its version:
>>> import tensorflow as tf
>>> tf.__version__
'2.3.0'
Now, we are going to use the Rock Paper Scissors dataset from Kaggle to perform multiclass image classification.
Let's jump into it!
1. Dataset exploration
The dataset has three directories: train, test, and validation. The train and test directories each contain one subfolder per image class (three in total), while validation holds a flat list of images to be tested.
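A minimal sketch of how the directory contents could be listed, using only the standard library. The `BASE_DIR` path and the helper name `list_split_contents` are hypothetical placeholders, not taken from the original code; point `BASE_DIR` at wherever the Kaggle dataset is stored.

```python
import os

# Hypothetical location of the dataset; adjust to your environment.
BASE_DIR = "data/rock-paper-scissors"


def list_split_contents(base_dir, splits=("train", "test", "validation"), limit=3):
    """Return up to `limit` entries found in each split directory.

    os.listdir returns entries in arbitrary order, so the order of the
    class names can differ from one machine to another.
    """
    contents = {}
    for split in splits:
        entries = os.listdir(os.path.join(base_dir, split))
        contents[split] = entries[:limit]
    return contents


# Only run the listing when the dataset is actually present.
if os.path.isdir(BASE_DIR):
    for split, entries in list_split_contents(BASE_DIR).items():
        print(f"{split.capitalize()} set --> {entries}")
```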
The output is:
Train set --> ['paper', 'scissors', 'rock']
Test set --> ['paper', 'scissors', 'rock']
Validation set --> ['paper8.png', 'paper1.png', 'scissors-hires1.png']
2. Dataset Sample
Let’s display a random image of each class from the dataset.
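One way this could be done is sketched here, assuming the train directory contains one subfolder per class. The `TRAIN_DIR` path and both helper names are hypothetical, and matplotlib is assumed to be installed for the plotting step.

```python
import os
import random

# Hypothetical path to the training split; adjust to your environment.
TRAIN_DIR = "data/rock-paper-scissors/train"

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def pick_random_image_per_class(train_dir, rng=random):
    """Map each class folder name to the path of one randomly chosen image."""
    samples = {}
    for class_name in sorted(os.listdir(train_dir)):
        class_dir = os.path.join(train_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        files = [f for f in os.listdir(class_dir)
                 if f.lower().endswith(IMAGE_EXTENSIONS)]
        if files:
            samples[class_name] = os.path.join(class_dir, rng.choice(files))
    return samples


def show_samples(samples):
    """Plot the chosen images side by side, titled with their class name."""
    # Imported lazily so the picker above works without matplotlib.
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg

    fig, axes = plt.subplots(1, len(samples),
                             figsize=(4 * len(samples), 4), squeeze=False)
    for ax, (class_name, path) in zip(axes[0], samples.items()):
        ax.imshow(mpimg.imread(path))
        ax.set_title(class_name)
        ax.axis("off")
    plt.show()


# Only run when the dataset is actually present.
if os.path.isdir(TRAIN_DIR):
    show_samples(pick_random_image_per_class(TRAIN_DIR))
```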
So, the images are:
3. Defining the CNN model
The model comprises five types of layer:
Convolution Layer: This layer extracts important features from the image
Pooling Layer: This layer reduces the spatial size of the feature maps produced by the convolution while retaining the most important features
Flatten Layer: Flattens the input into a one-dimensional array
Hidden Layer: Also called a dense (fully connected) layer, connects every neuron in one layer to every neuron in the next
Output Layer: The final layer, with as many neurons as there are classes
Here, we have three classes of image, so the output layer should have three neurons.
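The five layer types above can be sketched as a Keras `Sequential` model. This is an illustration rather than the original architecture: the 150x150 RGB input size, the filter counts, the kernel sizes, and the 128-unit hidden layer are all assumptions. Only the three-neuron softmax output follows directly from the text.

```python
import tensorflow as tf


def build_model(input_shape=(150, 150, 3), num_classes=3):
    """Small CNN using the five layer types described above.

    All sizes except `num_classes` are illustrative assumptions.
    """
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Convolution + pooling blocks: extract features, then downsample.
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        # Flatten the feature maps into a one-dimensional vector.
        tf.keras.layers.Flatten(),
        # Hidden (dense) layer.
        tf.keras.layers.Dense(128, activation="relu"),
        # Output layer: one neuron per class, softmax gives class probabilities.
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])


model = build_model()
model.summary()
```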
4. Model compilation and callback function
For this model, we use the Adam optimizer and categorical_crossentropy as the loss function.
The callback function stops training at the end of an epoch once the training accuracy exceeds 95%.
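A minimal sketch of both steps, assuming the model is compiled with `metrics=["accuracy"]` so that the epoch logs contain an `accuracy` entry. The class name `StopAtAccuracy` and the helper `compile_model` are hypothetical names chosen for this illustration.

```python
import tensorflow as tf


class StopAtAccuracy(tf.keras.callbacks.Callback):
    """Stop training once training accuracy exceeds `threshold`."""

    def __init__(self, threshold=0.95):
        super().__init__()
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        accuracy = (logs or {}).get("accuracy")
        if accuracy is not None and accuracy > self.threshold:
            print(f"\nReached {self.threshold:.0%} accuracy, stopping training.")
            self.model.stop_training = True


def compile_model(model):
    """Compile with the Adam optimizer and categorical cross-entropy loss."""
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The callback is then passed to training through the `callbacks` argument, for example `callbacks=[StopAtAccuracy()]`.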
5. Generators
Training Generator with Image augmentation
Found 2520 images belonging to 3 classes.
Validation Generator
Found 372 images belonging to 3 classes.
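Both generators could be built roughly as follows with Keras' `ImageDataGenerator`; the `Found ... images` lines above are what `flow_from_directory` prints. The augmentation parameters, the 150x150 target size, the batch size, and the directory paths are assumptions for illustration, not values from the original code. The validation generator presumably reads from the test directory, since that is the other split organized into three class subfolders.

```python
import os
import tensorflow as tf

ImageDataGenerator = tf.keras.preprocessing.image.ImageDataGenerator

# Hypothetical paths; adjust to your environment.
TRAIN_DIR = "data/rock-paper-scissors/train"
TEST_DIR = "data/rock-paper-scissors/test"


def make_datagens():
    """Return (training, validation) image data generators.

    Only the training generator augments; both rescale pixels to [0, 1].
    The augmentation settings are illustrative assumptions.
    """
    train_datagen = ImageDataGenerator(
        rescale=1.0 / 255,
        rotation_range=20,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode="nearest",
    )
    validation_datagen = ImageDataGenerator(rescale=1.0 / 255)
    return train_datagen, validation_datagen


def flow(datagen, directory, target_size=(150, 150), batch_size=32):
    """Stream batches from a directory with one subfolder per class."""
    return datagen.flow_from_directory(
        directory,
        target_size=target_size,
        batch_size=batch_size,
        class_mode="categorical",  # one-hot labels, matching the loss function
    )


# Only build the directory iterators when the dataset is present.
if os.path.isdir(TRAIN_DIR) and os.path.isdir(TEST_DIR):
    train_datagen, validation_datagen = make_datagens()
    train_generator = flow(train_datagen, TRAIN_DIR)
    validation_generator = flow(validation_datagen, TEST_DIR)
```

Augmentation is applied only to the training data so that validation accuracy is measured on unmodified images.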
6. Fitting the model
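Because the training data comes from generators, the original code calls model.fit_generator instead of model.fit. Note that since TensorFlow 2.1, model.fit accepts generators directly and fit_generator is deprecated, so model.fit is the better choice on the TensorFlow version shown above.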
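A hedged sketch of the fitting step using `model.fit`, which accepts generators in TensorFlow 2.1 and later. The helper name `train_with_generators` and the default of 25 epochs are assumptions; the original epoch count is not given in the text.

```python
def train_with_generators(model, train_generator, validation_generator=None,
                          epochs=25, callbacks=None, **fit_kwargs):
    """Fit a compiled Keras model on generator-provided batches.

    `epochs=25` is an illustrative default. Extra keyword arguments such as
    `steps_per_epoch` or `verbose` are forwarded to `model.fit`.
    """
    return model.fit(
        train_generator,
        validation_data=validation_generator,
        epochs=epochs,
        callbacks=callbacks,
        **fit_kwargs,
    )
```

The returned `History` object holds the per-epoch metrics in `history.history`, which is what the training curves are plotted from.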