Conclusion

This project helped us understand the background of neural networks and how they work under the hood. Through training, we learned what the optimal network architecture looks like when training with either Gradient Descent or a Genetic Algorithm. Each input argument, such as the learning rate, batch size or number of epochs, affects both the time spent training and the quality of the results. The main goal is to find the best trade-off between those two factors.

Representability and trainability are the two main attributes that describe a network, and the layer structure has a huge impact on both. A network is “representable” if it can represent a problem of a given level of complexity: the more layers it has, the more complex the problems it can represent. The difficulty lies in actually training the network, because too many layers, or too many nodes per layer, can lead to slow convergence or high demands on computational power.
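
To make the trade-off concrete, here is a minimal sketch (not taken from the project code, and the layer sizes are made-up examples) of how adding layers and nodes to a plain fully connected network inflates the number of trainable parameters:

```python
def parameter_count(layer_sizes):
    """Weights + biases of a fully connected network, e.g. [784, 64, 10]."""
    return sum(
        n_in * n_out + n_out                 # weight matrix + bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

shallow = [784, 16, 10]              # small capacity, fast to train
deep = [784, 256, 256, 128, 10]      # more representable, slower to train

print(parameter_count(shallow))      # 12,730 parameters
print(parameter_count(deep))         # ~301,000 parameters
```

More parameters mean more expressive power, but every extra parameter is one more value the optimizer has to fit, which is where the trainability cost comes from.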

The learning rate determines how big a step is taken towards the global optimum. If the step is too big, the network might never converge because it keeps “jumping over” the optimum. On the other hand, too small a learning rate leads to slow convergence and long training times. The conclusion is that there is no “one size fits all”; the user must experiment with different architectures and hyperparameters to find the best solution for the problem at hand.
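
The effect can be seen on a toy example. Below is a minimal sketch (an illustration under simple assumptions, not the project’s training code) of gradient descent on f(x) = x², whose gradient is 2x, with three different learning rates:

```python
def gradient_descent(learning_rate, steps=20, x=5.0):
    """Plain gradient descent on f(x) = x^2; the optimum is at x = 0."""
    for _ in range(steps):
        grad = 2 * x                  # derivative of x^2
        x = x - learning_rate * grad  # standard update rule
    return x

print(gradient_descent(0.01))  # too small: still far from 0 after 20 steps
print(gradient_descent(0.4))   # reasonable: converges very close to 0
print(gradient_descent(1.1))   # too big: "jumps over" 0 and diverges
```

The same step rule behaves completely differently depending on the learning rate, which is why this value has to be tuned per problem.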

The Gradient Descent algorithm has shown better results than the Genetic Algorithm. It takes less time to train the neural network, converges faster and requires significantly less computing power. While slower, GAs are better suited to multi-criteria problems.

Thank you for reading!