CNNs from different viewpoints

Prerequisite: Basic neural networks

The theme of this post

The image

The filter

Since the filter fits in the image four times, we have four results

Here’s how we applied the filter to each section of the image to yield each result
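
To make that concrete, here’s a minimal sketch of the sliding-filter computation in NumPy. The 3×3 image and 2×2 filter shapes are my assumptions (a four-weight filter that fits the image in four positions is consistent with them), and the pixel and weight values are placeholders:

```python
import numpy as np

# Placeholder 3x3 image and 2x2 filter (weights α, β, γ, δ).
# These shapes are assumed: a 2x2 filter fits a 3x3 image in
# exactly four positions with stride 1.
image = np.arange(9, dtype=float).reshape(3, 3)
filt = np.array([[0.1, 0.2],    # α, β
                 [0.3, 0.4]])   # γ, δ
bias = 0.5

# Apply the filter to each 2x2 section of the image.
results = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        section = image[i:i+2, j:j+2]
        results[i, j] = np.sum(section * filt) + bias

print(results)   # four results, one per section of the image
```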

The equation view
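
The figure here gives one equation per section of the image. As a sketch of those four equations, with the nine pixels labeled A through I, row by row (the labels are my assumption; b is the bias):

result₁ = αA + βB + γD + δE + b
result₂ = αB + βC + γE + δF + b
result₃ = αD + βE + γG + δH + b
result₄ = αE + βF + γH + δI + b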

Notice that the bias term, b, is the same for each section of the image. You can consider the bias part of the filter, just like the weights (α, β, γ, δ).

The compact equation view
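
The four equations differ only in which pixels they read, so they collapse into one. A sketch of the compact form, reusing my pixel labels: write (w₁, w₂, w₃, w₄) = (α, β, γ, δ) and let xₙ₁, …, xₙ₄ be the four pixels of section n, read left to right, top to bottom. Then

resultₙ = b + w₁xₙ₁ + w₂xₙ₂ + w₃xₙ₃ + w₄xₙ₄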

The neural network view

The matrix multiplication view

The matrix above is a weight matrix, just like the ones from traditional neural networks. However, this weight matrix has two special properties:

1. The zeros shown in gray are untrainable, meaning they’ll stay zero throughout the optimization process.
2. Some of the weights are equal, and while they are trainable (i.e. changeable), they must remain equal. These are called “shared weights”.

The zeros correspond to the pixels that the filter didn’t touch. Each row of the weight matrix corresponds to one application of the filter.
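
Here’s a sketch of that weight matrix in NumPy, under the same assumed 3×3 image and 2×2 filter as above. Multiplying it by the flattened image and adding the bias reproduces the four sliding-filter results:

```python
import numpy as np

a, b_, g, d = 0.1, 0.2, 0.3, 0.4   # α, β, γ, δ (b_ avoids clashing with the bias, b)
bias = 0.5

# Each row applies the filter to one section of the flattened image.
# The zeros are the pixels that row's application never touches;
# the repeated a, b_, g, d entries are the shared weights.
W = np.array([
    [a, b_, 0,  g, d,  0, 0, 0,  0],
    [0, a,  b_, 0, g,  d, 0, 0,  0],
    [0, 0,  0,  a, b_, 0, g, d,  0],
    [0, 0,  0,  0, a,  b_, 0, g, d],
])

image = np.arange(9, dtype=float)   # the 3x3 image, flattened row by row
results = W @ image + bias
print(results)                      # matches the sliding-filter results
```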

The dense neural network view

This graph is the same as the one in the neural network view, except that it also shows the untrainable zeros as gray connections. This view helped me see the connection between traditional neural networks and CNNs.
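
Those two constraints (frozen zeros, tied weights) are easy to express in code. Below is a minimal sketch of one gradient-descent step that respects them, again under my assumed shapes; the targets and learning rate are made up for the demo. Updating the four underlying filter weights, each with the pooled gradient of all its copies, keeps the copies equal and leaves the gray zeros at zero:

```python
import numpy as np

# tie[r, c] says which shared weight entry (r, c) uses:
# 0..3 for α..δ, or -1 for an untrainable zero.
tie = np.array([
    [ 0,  1, -1,  2,  3, -1, -1, -1, -1],
    [-1,  0,  1, -1,  2,  3, -1, -1, -1],
    [-1, -1, -1,  0,  1, -1,  2,  3, -1],
    [-1, -1, -1, -1,  0,  1, -1,  2,  3],
])

def expand(filt):
    """Build the dense 4x9 weight matrix from the four shared weights."""
    W = np.zeros(tie.shape)
    for k in range(4):
        W[tie == k] = filt[k]
    return W

filt = np.array([0.1, 0.2, 0.3, 0.4])   # α, β, γ, δ
bias, lr = 0.5, 0.01
x = np.arange(9, dtype=float)           # flattened 3x3 image
target = np.ones(4)                     # made-up targets for the demo

# Forward pass and squared-error gradient.
y = expand(filt) @ x + bias
dL_dy = 2.0 * (y - target)
dL_dW = np.outer(dL_dy, x)              # gradient for every dense weight

# The update: the zeros get none of it, and each shared weight pools
# the gradients of all its copies -- so the copies stay equal.
for k in range(4):
    filt[k] -= lr * dL_dW[tie == k].sum()
bias -= lr * dL_dy.sum()
```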

A familiar diagram