Neural Network - Alexnet Image Classifier
Home : www.sharetechnote.com
Roughly, there are two ways to use a neural network. One is to develop the network on your own, and the other is to use one of the various pretrained networks. The Matlab Deep Learning Toolbox provides interface functions that give you access to various pretrained neural networks. As of now (Dec 2019), I see around 16 different pretrained neural networks that you can use from the toolbox. See the MathWorks documentation for the list of all the pretrained networks you can use from the toolbox.
This tutorial shows a simple example of how to get access to one of those pretrained networks and use it to classify your own data. The pretrained network that I am going to use in this tutorial is called Alexnet.
Since the network structure (especially the input layer structure) differs from one pretrained network to another, you cannot copy this example blindly for other pretrained networks, but it should give you a concrete understanding of the overall flow of using a pretrained network.
Step 1 : Getting Access to the pretrained Network
The first step is to load Alexnet and store the network in a variable. It is as simple as follows.
alxNet = alexnet
You can print out the overall structure of the network simply by displaying the variable.
SeriesNetwork with properties:
Layers: [25×1 nnet.cnn.layer.Layer]
You see that this network is a CNN type of network which is made up of 25 layers.
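If the call in Step 1 fails, a likely cause is that the AlexNet support package is not installed. The following is a minimal sketch of a guarded load, assuming the Deep Learning Toolbox is present. The flag name netLoaded is my own and is there only for illustration.

```matlab
% Try to load the pretrained AlexNet and print a readable message on failure.
% netLoaded is a hypothetical flag, not part of any toolbox.
netLoaded = false;
try
    alxNet = alexnet;      % errors if the network cannot be loaded
    netLoaded = true;
catch err
    % err.message usually explains what is missing
    fprintf('Could not load AlexNet: %s\n', err.message);
end

if netLoaded
    fprintf('AlexNet loaded with %d layers.\n', numel(alxNet.Layers));
end
```

When the load succeeds, the second message should report the same layer count shown in the printout above.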
Step 2 : Finding the details of the network
Next, you will want to know the details of each layer in the network. You can get them by printing the Layers property of the network variable as below.
alxLayers = alxNet.Layers
25x1 Layer array with layers:
 1   'data'     Image Input                   227x227x3 images with 'zerocenter' normalization
 2   'conv1'    Convolution                   96 11x11x3 convolutions with stride [4 4] and padding [0 0 0 0]
 3   'relu1'    ReLU                          ReLU
 4   'norm1'    Cross Channel Normalization   cross channel normalization with 5 channels per element
 5   'pool1'    Max Pooling                   3x3 max pooling with stride [2 2] and padding [0 0 0 0]
 6   'conv2'    Grouped Convolution           2 groups of 128 5x5x48 convolutions with stride [1 1] and padding [2 2 2 2]
 7   'relu2'    ReLU                          ReLU
 8   'norm2'    Cross Channel Normalization   cross channel normalization with 5 channels per element
 9   'pool2'    Max Pooling                   3x3 max pooling with stride [2 2] and padding [0 0 0 0]
10   'conv3'    Convolution                   384 3x3x256 convolutions with stride [1 1] and padding [1 1 1 1]
11   'relu3'    ReLU                          ReLU
12   'conv4'    Grouped Convolution           2 groups of 192 3x3x192 convolutions with stride [1 1] and padding [1 1 1 1]
13   'relu4'    ReLU                          ReLU
14   'conv5'    Grouped Convolution           2 groups of 128 3x3x192 convolutions with stride [1 1] and padding [1 1 1 1]
15   'relu5'    ReLU                          ReLU
16   'pool5'    Max Pooling                   3x3 max pooling with stride [2 2] and padding [0 0 0 0]
17   'fc6'      Fully Connected               4096 fully connected layer
18   'relu6'    ReLU                          ReLU
19   'drop6'    Dropout                       50% dropout
20   'fc7'      Fully Connected               4096 fully connected layer
21   'relu7'    ReLU                          ReLU
22   'drop7'    Dropout                       50% dropout
23   'fc8'      Fully Connected               1000 fully connected layer
24   'prob'     Softmax                       softmax
25   'output'   Classification Output         crossentropyex with 'tench' and 999 other classes
When it comes to utilizing a pretrained network, the most important thing is to understand all the details of the input layer (layer 1) of the network because you have to preprocess your data according to the requirement of the input layer.
You can print out the details of the input layer (layer 1) by printing out the first element of the variable that stores all the layer information in previous step.
alxInput = alxLayers(1)
ImageInputLayer with properties:
InputSize: [227 227 3]
Mean: [227×227×3 single]
In the printout, you see InputSize: [227 227 3]. This means that this network (a CNN) requires an image of size 227 x 227 with 3 color channels (i.e., RGB color). If you want to save only this InputSize, you can do it as shown below.
alxInputSize = alxInput.InputSize
227 227 3
In the same way, you can get the details of the output layer of the network by printing out the last element of the layer array as shown below.
alxOutput = alxLayers(end)
ClassificationOutputLayer with properties:
Classes: [1000×1 categorical]
The most important information about the output layer is the list of all the category labels of the network. You can get the full list of output labels by printing out the Classes property of the output layer as shown below.
alxCategory = alxOutput.Classes
1000×1 categorical array
great white shark
Step 3 : Preprocessing an image to fit to the Input layer requirement of the network.
Now I want to load my own image file that I want to classify with the pretrained network (Alexnet).
First, load an image and store it to a variable as shown below. In order for you to try with your own data (image file in this case), put the image into a folder on your PC and modify the file path according to the location and file name in your PC.
imgfile = fullfile(pwd, 'temp', 'Apple_02.png')   % builds the path with the right separator for your OS
img = imread(imgfile);
Now I have to make sure that this image is in the proper format (size and color channels) to fit the input layer requirement of Alexnet. Recall that the input dimension of the network is 227 x 227 x 3. The image that I am using for this tutorial is already RGB color, so I don't need to do any preprocessing to match the color channels. I only need to change the image size to fit Alexnet by using the imresize() function as shown below.
img = imresize(img,[227,227]);
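The resize above is enough only when the image already has three color channels. If your own image is grayscale, imread() returns a 2-D array and the network input would not match. The following is a minimal sketch of one way to handle that case, assuming the Image Processing Toolbox is available for imresize(). The random image is a stand-in for your own file, and inputSize repeats the value stored in alxInputSize.

```matlab
% Stand-in for a grayscale image; replace with the result of imread() on your file.
img = uint8(randi([0 255], 300, 400));   % hypothetical 300x400 grayscale image

inputSize = [227 227 3];                 % same value as alxInput.InputSize

% A grayscale image has a single channel: replicate it into R, G and B.
if size(img, 3) == 1
    img = repmat(img, [1 1 3]);
end

% Resize height and width only; the channel count is already 3.
img = imresize(img, inputSize(1:2));

disp(size(img));
```

Reading the target size from inputSize instead of typing 227 by hand makes it easier to reuse the same lines with another pretrained network.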
Plot the image to check it.
Step 4 : Classify my image with the Alexnet
Now let's put my image into Alexnet and let it classify it. This can be done by the single line as shown below.
[imgClass,imgScores] = classify(alxNet,img);
And you get the result as shown below. The imgClass variable stores the label that Alexnet came up with for the image that I put into the network.
The imgScores variable stores the score of the image for each and every label (i.e., 1000 labels for Alexnet).
1×1000 single row vector
Columns 1 through 13
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 ...
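Most of the 1000 scores are close to zero, so it is often more useful to look at only the few best candidates. The following is a minimal sketch of picking the top k labels by sorting the scores. The five scores and labels are made-up stand-ins so that the snippet runs on its own. With the variables from this tutorial you would use imgScores and cellstr(alxCategory) instead.

```matlab
% Stand-in data (hypothetical values); in the tutorial these would be
% imgScores (1x1000 single) and cellstr(alxCategory) (1000x1 cell).
scores = single([0.02 0.70 0.05 0.20 0.03]);
labels = {'tench'; 'Granny Smith'; 'goldfish'; 'pomegranate'; 'orange'};

k = 3;                                        % number of candidates to keep

% Sort from highest to lowest score and remember the original positions.
[sortedScores, order] = sort(scores, 'descend');
topScores = sortedScores(1:k);
topLabels = labels(order(1:k));

% Print one candidate per line.
for i = 1:k
    fprintf('%-15s %.4f\n', topLabels{i}, topScores(i));
end
```

The first entry of topLabels corresponds to the label that classify() returns in imgClass, since that is the class with the highest score.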
This is not a mandatory step, but let's look a little further into the classification result by plotting it as below.
In the first plot, I plot the scores of all the labels of the output layer (1000 labels in Alexnet).
In the second plot, I extract the labels that score higher than 0.01.
imgHighScores = imgScores > 0.01;
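The plotting commands themselves are not shown here, so the following is only one possible way to draw the two plots described above, not necessarily the code used for the original figures. The scores and labels are made-up stand-ins so that the snippet runs on its own. With the variables from this tutorial you would use imgScores and cellstr(alxCategory).

```matlab
% Stand-in data (hypothetical values) in place of imgScores and the class labels.
scores = single([0.002 0.700 0.005 0.280 0.013]);
labels = {'tench'; 'Granny Smith'; 'goldfish'; 'pomegranate'; 'orange'};

% Logical mask of the labels that score above the threshold.
threshold = 0.01;
isHigh = scores > threshold;
highScores = scores(isHigh);
highLabels = labels(isHigh);

figure;

% First plot: scores of all labels.
subplot(2, 1, 1);
plot(scores);
xlabel('Label index');
ylabel('Score');
title('Scores of all labels');

% Second plot: only the labels above the threshold, named on the x axis.
subplot(2, 1, 2);
bar(double(highScores));
set(gca, 'XTick', 1:numel(highScores), 'XTickLabel', highLabels);
ylabel('Score');
title(sprintf('Labels scoring above %.2f', threshold));
```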