
Convolutional Neural Network – How to Code Some of the Critical Steps


While I was writing my custom program, I realized that some of the functions could be a little tricky to handle. I have tried to make this not just another article on how a CNN works or the mathematics behind it, but one about the techniques for how data is structured and formatted as it moves from the convolution layers to the max pool, to the fully connected layer, and back again.

I will take an example to explain how the convolve, max pool, FC, and back-propagation datasets will look, with details on some of the key functions.

The steps we will go through below (typical of a CNN) are as follows:

In the __init__ method, we need to initialize the following:

  1. Image Data
  2. Output labels
  3. Filter/patch array (the filter array is initialized dynamically; it will be updated later as we adjust the weights and filters during back-propagation)

The CNN function covers the following:

  1. Multiple Convolution and Max Pool layers. Each convolution layer can have multiple kernels/arrays of filters that we can apply on the image input to that layer.
  2. The final output coming out of the final Max Pool layer will be flattened into a single-dimensional array and fed into the Fully Connected (Multi-Layer Neuron Network).
  3. The Multi-Layer Neuron Network can have N hidden layers with N neurons in each layer.
  4. The output of the last layer in the FC network can be fed into a SoftMax function to determine the probability of the image class.
  5. The back-propagation process will have the following, in this order (to keep it simple, assuming that we have a single convolution and max pool):
    • Error/performance cost function (a function of the desired and actual outputs).
    • Back propagate through the FC layer and calculate the delta error.
    • Update the weights of the FC layer with the delta error.
    • Back propagate through the max pool.
    • Prepare the dataset for the convolution layer (e.g., expanding the 3*3 that we derived in the forward path back to a 6*6).
    • Apply the activation/ReLU to the dataset.
    • Take this output and do a dot product with the Delta Network.
    • Take the above output and convolve over the input image. This is not going to be a simple dot product. The operation is similar to what we did in the forward path.
    • The output will be a dataset of the same dimension as the filter map. Update the filter map.
  6. In the end, we have an updated filter map and FC layer weight network.
  7. Repeat for N iterations.

Code of the Important Functions

Step 1: INIT Method

I am using hard-coded values (an example from a CNN tutorial), as the objective is to show the code of some of the critical functions.
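Below is a minimal sketch of the __init__ method. The 8*8 image with a vertical line of 1s and the output labels are hypothetical values chosen to be consistent with this example's outputs; the 10 vertical-line filter patches match the filter map shown in Step 10.

Python

class CNN:
    def __init__(self):
        # Hard-coded 8*8 input image with a vertical line of 1s in the
        # middle columns (a hypothetical value for this example).
        self.image = [[0, 0, 0, 1, 1, 0, 0, 0] for _ in range(8)]
        # Desired output labels for the image classes (hypothetical).
        self.desired_output = [1, 0]
        # 10 filter patches of 3*3 each; these are updated later
        # during back-propagation.
        self.filter_map = [[[0.0, 1.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [0.0, 1.0, 0.0]] for _ in range(10)]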


Step 2: Creating the Feature Map Dataset (on the Input Image, to Help Apply the Filter)
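A sketch of how this can look (create_feature_map is a hypothetical name for the patch-collection step that feeds applyFilter):

Python

def create_feature_map(self, data, window=3, stride=1):
    # Slide a window*window patch over the image, moving `stride`
    # pixel(s) at a time, horizontally and vertically, and collect
    # each patch into a list.
    feature_map = []
    rows, cols = len(data), len(data[0])
    for r in range(0, rows - window + 1, stride):
        for c in range(0, cols - window + 1, stride):
            patch = [row[c:c + window] for row in data[r:r + window]]
            feature_map.append(patch)
    return feature_map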


The feature map list will look like the following: we traverse the 8*8 image with a 3*3 window, moving 1 pixel/stride both horizontally and vertically, which yields (8 - 3 + 1) * (8 - 3 + 1) = 36 patches of 3*3 each. This is what gets printed as the "Feature Map List in applyFilter and size."


Step 3: Apply the Filter

This step is simple. The filter of 3*3 (e.g., [[0,1,0],[0,1,0],[0,1,0]]) is applied to each patch above. The core of the logic is in these two lines:
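A plausible reconstruction of those two lines, assuming patch and kernel are 3*3 lists (element-wise multiplication followed by a sum):

Python

# Multiply each patch element with the matching filter element,
# then add everything up to get one output value.
products = [p * f for prow, frow in zip(patch, kernel)
                  for p, f in zip(prow, frow)]
value = sum(products)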


The sample output of the above operation (multiplication and addition) is one 6*6 matrix per filter. The number of sub-lists is 10, as we have 10 filter maps; in this example, each 6*6 map has rows of [0.0, 0.0, 3.0, 3.0, 0.0, 0.0] (consistent with the max pool inputs shown in Step 7).


Kernels = 10; 360 elements in total (10 * 6 * 6).

Step 4: Apply the ReLU Activation Function

This is also a simple step: traverse the arrays/lists and set each value that is less than 0 to 0; values of 0 or greater pass through unchanged.
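A minimal sketch of this step over the list of 6*6 maps:

Python

def relu(self, feature_maps):
    # Element-wise ReLU: negatives become 0, everything else is kept.
    return [[[max(0.0, v) for v in row] for row in fmap]
            for fmap in feature_maps]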

Step 5: Max Pool Function

We saw in Step 3 that there are 360 elements. As part of the max pool, we reduce each 6*6 matrix to a 3*3 matrix.

To make the max function work, we need two values to compare. We can use the max function to determine the max of each sub-list, as in the snippet below:
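For example, for one 2*2 block:

Python

# Max of one 2*2 block: max of each row, then max of the two results.
block = [[0.0, 0.0], [3.0, 3.0]]
block_max = max(max(block[0]), max(block[1]))   # -> 3.0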


So we break the whole dataset from Step 3 into 2*2 blocks and apply the max function to one 2*2 matrix at a time:
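A sketch of the whole traversal, reducing one 6*6 map to 3*3 (max_pool is a hypothetical name):

Python

def max_pool(self, fmap, pool=2):
    # Take the max of each non-overlapping 2*2 block.
    pooled = []
    for r in range(0, len(fmap), pool):
        pooled_row = []
        for c in range(0, len(fmap[0]), pool):
            block = [row[c:c + pool] for row in fmap[r:r + pool]]
            pooled_row.append(max(max(b) for b in block))
        pooled.append(pooled_row)
    return pooled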


Post the max pool operation, the dataset will look like the one below. The 3*3 matrix is the final output from the max pool; it appears 10 times because there are 10 filter maps and the operation is run for all the filter patches:

('Max Pool Map final List', [[[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0],[0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]], [[0.0, 3.0, 0.0], [0.0, 3.0, 0.0], [0.0, 3.0, 0.0]]])


Step 6: Flatten the List

We take the output of Step 5 and create a one-dimensional array of the elements. The length of this array is the number of input features to the fully connected layer. The details of how to code an ANN are given in my other blog: https://medium.com/@chakrru2/forward-and-back-propagation-programming-technique-steps-to-train-an-artificial-neural-net-1968d3e0e040

The flattenList function can look like this:
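A one-line sketch using a nested comprehension (the 90-feature length assumes 10 maps of 3*3):

Python

def flattenList(self, pooled_maps):
    # 10 pooled maps of 3*3 -> one flat vector of 10 * 3 * 3 = 90
    # features for the fully connected layer.
    return [v for fmap in pooled_maps for row in fmap for v in row]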

Step 7: Back Propagate Max Pool

After we have calculated the delta error from the fully connected ANN, we have to move backward through the max pool into the convolution layer. When we move back through the max pool, we have to do the following two things:

  • During the forward propagation, we stored the input to all the max layers in a list. We take each input and set its values to 0 (for values <= 0) or to 1 (for values greater than 0). Why 1? Because d/dx of x is 1.

    We take a 2*2 again, find the max, check whether it is 0 or greater, and then set it to 0 or 1.

  • Max Pool Layer Input before back propagation [[[[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0],[0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]], [[0.0, 0.0], [0.0, 0.0]]]],360 elements


  • Then, we expand the 3*3 into a 6*6 again so that the convolution layer dimensions match the max pool layer dimensions.

The functions for these two operations can look like the sketch below. Of course, you can write better-looking functions than this:
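A sketch covering both bullets (relu_derivative and expand_to_conv_dim are hypothetical names):

Python

def relu_derivative(self, fmap):
    # Values <= 0 become 0; values > 0 become 1 (d/dx of x is 1).
    return [[1.0 if v > 0 else 0.0 for v in row] for row in fmap]

def expand_to_conv_dim(self, pooled, pool=2):
    # Stretch a 3*3 map back to 6*6 by repeating each value over
    # its original 2*2 pooling block.
    expanded = []
    for row in pooled:
        wide = [v for v in row for _ in range(pool)]  # widen columns
        for _ in range(pool):                         # repeat rows
            expanded.append(list(wide))
    return expanded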



The expanded dataset consists of 10 sub-lists of 6*6 each (there are 10 such sub-lists because there are 10 filter patches).


Step 8: Back Propagate Convolution

With the above input, we have to do the following:

  • Put the dataset through the activation/ReLU derivative function (simple element-by-element changes to 0s and 1s).
  • At this point, we take the delta error of the fully connected layer and multiply it with the elements (a sketch follows this list).
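A sketch of this multiplication; treating the delta error as a single scalar per map is a simplifying assumption (in a full implementation, each element would receive its own back-propagated delta):

Python

def backprop_activation(self, expanded_maps, delta_error):
    # Apply the 0/1 ReLU derivative element-wise, then multiply by
    # the delta error from the fully connected layer.
    return [[[(1.0 if v > 0 else 0.0) * delta_error for v in row]
             for row in fmap]
            for fmap in expanded_maps]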

In this example, the output is a 6*6 map repeated 10 times, once for each of the 10 filter maps (printed as "Conv Layer after activation step during backpropagation and Error Calc").


Step 9: Convolve With the Input Image

If Output = Input (convolved with) filter_array, then the derivative of Output with respect to filter_array is the convolution of the Input (image) with the delta at the Output.

For this, we

  • Run the function from Step 2 over the input image with a window dimension of 6 (as the matrix above is 6*6); the output dimension is (8 - 6) + 1 = 3 [the input image is 8*8].
  • The dataset post this operation will consist of 9 patches of 6*6 each (printed as "Feature Map dataset and size for conv backprop").


We convolve the above dataset with the dataset from Step 8, basically applying the dataset from Step 8 as a filter over the above patches (with the core logic again being element-wise multiplication and addition), as in the sketch below:
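A sketch of this convolution for one filter map (convolve_for_filter_update is a hypothetical name); it reuses the sliding-window logic of Step 2:

Python

def convolve_for_filter_update(self, image, delta_map):
    # Slide the 6*6 delta map over the 8*8 input image:
    # output dimension is (8 - 6) + 1 = 3, giving one 3*3 result.
    window = len(delta_map)               # 6
    out_dim = len(image) - window + 1     # 3
    result = []
    for r in range(out_dim):
        result_row = []
        for c in range(out_dim):
            patch = [row[c:c + window] for row in image[r:r + window]]
            result_row.append(sum(p * d
                                  for prow, drow in zip(patch, delta_map)
                                  for p, d in zip(prow, drow)))
        result.append(result_row)
    return result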

  • After the operation, we get a 3*3 matrix per filter; there are 10 such sub-lists, as the number of filter patches is 10 (printed as "filtered_feature_map_backprop and size"). Since the test data was perfect, the delta changes are all zero and we don't need a second iteration.


Step 10: Update the Filter Maps

This is a very simple step: we take the filter map (dimension = 3*3*10) and subtract the above dataset (dimension = 3*3*10) from it.
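A minimal sketch of the update (update_filter_map is a hypothetical name; a plain subtraction matches the description above, though scaling by a learning rate is the usual practice):

Python

def update_filter_map(self, filter_map, filter_grads):
    # Subtract the 3*3*10 gradient dataset from the 3*3*10 filter map.
    for k in range(len(filter_map)):
        for i in range(len(filter_map[k])):
            for j in range(len(filter_map[k][i])):
                filter_map[k][i][j] -= filter_grads[k][i][j]
    return filter_map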


The filterMap dataset after the update is the same dataset that we initialized with (the test data was perfect, hence no change):

('Filter Map after update', [[[[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]]])

In a real scenario, we would run for some number of iterations until we reach the minimum point of the cost function with a strong accuracy percentage.

Finally, in the predict function, once we have the updated filter map and the FC weights, we perform the following steps (a sketch follows the list):

  • Run convolution with updated filter patches.
  • Run maxpool.
  • Flatten list.
  • Forward propagate with updated neuron weights.
  • Predict through SoftMax in the final layer.
  • Return predicted value.
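A sketch of how the predict flow can be wired together from the pieces above (apply_filters, forward_propagate, and softmax are hypothetical names for the filtering and FC-side functions):

Python

def predict(self, image):
    patches = self.create_feature_map(image, window=3)
    conv_out = self.apply_filters(patches, self.filter_map)  # Step 3
    relu_out = self.relu(conv_out)                           # Step 4
    pooled = [self.max_pool(fmap) for fmap in relu_out]      # Step 5
    features = self.flattenList(pooled)                      # Step 6
    scores = self.forward_propagate(features)  # updated FC weights
    probs = self.softmax(scores)
    # The predicted class is the one with the highest probability.
    return probs.index(max(probs))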