When we create a CNN layer, we have to define the kernel size, which is simply the dimensions of the filter window. For example, in the image above, we have a 2 x 2 kernel, meaning that we pool each 2 x 2 square of pixels into a single value to get our reduced image. We can increase or decrease the pooling size to change the size of the reduced image: a larger kernel gives a smaller reduced image, and a smaller kernel gives a larger one.
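To make the 2 x 2 pooling idea concrete, here is a minimal sketch in plain NumPy. It assumes a single-channel image stored as a 2D array and non-overlapping windows; the function name `pool2d` and its parameters are made up for illustration and do not come from any particular deep learning library.

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """Pool non-overlapping size x size squares of a 2D image (illustrative helper)."""
    h, w = image.shape
    h_out, w_out = h // size, w // size
    # Drop any leftover rows/columns that do not fill a complete window.
    trimmed = image[:h_out * size, :w_out * size]
    # Split the image into blocks: axes 1 and 3 index the pixels inside each window.
    blocks = trimmed.reshape(h_out, size, w_out, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

image = np.arange(16).reshape(4, 4)        # a toy 4 x 4 "image"
max_pooled = pool2d(image, size=2, mode="max")
avg_pooled = pool2d(image, size=2, mode="average")
print(max_pooled)   # 2 x 2 result: each value is the max of one 2 x 2 square
print(avg_pooled)   # 2 x 2 result: each value is the mean of one 2 x 2 square
```

With a 2 x 2 kernel the 4 x 4 image shrinks to 2 x 2; calling `pool2d(image, size=4)` would reduce it to a single value, which shows how a larger pooling size produces a smaller output.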
The first thing we need to do is take our data and ‘clean’ it: extract the pixel values, and possibly change the color, rotate the picture, or crop the sides to help prevent overfitting. Next, we build a model that takes the pixel values as its x values (inputs) and applies our max pooling or average pooling function. Finally, we train our model and test its accuracy.
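The cleaning step described above could be sketched roughly as follows. This is only an illustration in NumPy, assuming images arrive as height x width x 3 arrays of 8-bit RGB values; the helper name `preprocess` and its `crop` and `rotations` parameters are hypothetical, and a real project would more likely rely on the augmentation tools in its chosen framework.

```python
import numpy as np

def preprocess(image, crop=1, rotations=1):
    """Illustrative cleaning step: scale, convert to grayscale, rotate, crop."""
    # Extract pixel values and scale them from 0-255 to the range 0-1.
    pixels = image.astype(np.float32) / 255.0
    # Change the color: average the three channels into one grayscale channel.
    gray = pixels.mean(axis=-1)
    # Rotate the picture by 90 degrees, `rotations` times.
    rotated = np.rot90(gray, k=rotations)
    # Crop `crop` pixels from every side.
    if crop > 0:
        return rotated[crop:-crop, crop:-crop]
    return rotated

# A toy all-white 6 x 8 RGB image stands in for real data here.
toy_image = np.full((6, 8, 3), 255, dtype=np.uint8)
cleaned = preprocess(toy_image, crop=1, rotations=1)
print(cleaned.shape)
```

In practice the rotation and crop would usually be chosen at random for each training image, so that the model sees slightly different versions of the same picture and is less likely to memorize it.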
CNNs and computer vision have great potential to bring AI into many different fields, from robotics to facial recognition. For all their potential, CNNs tend to be quite simple to build, requiring only a few steps to create a well-built model. In this chapter, we will go over how to implement the basic steps outlined above, as well as tips and tricks for creating a great model.