
Using dropout layers
Finally, adding dropout layers is a widely used technique for regularizing neural networks and preventing them from overfitting. Here we, quite literally, drop some neurons out of our model at random. Why? The benefit is twofold. First, the contributions these neurons would have made to the activations of neurons further down our network are ignored during the forward pass of data through the network. Second, no weight adjustments are applied to the dropped neurons during backpropagation.

While seemingly bizarre, there is good intuition behind this. Neuron weights are adjusted at each backward pass to specialize in a specific feature of the training data, but specialization breeds dependence. What often ends up happening is that surrounding neurons start relying on the specialization of a certain neuron in their vicinity instead of doing some representational work themselves. This dependence pattern is known as complex co-adaptation, a term coined by Artificial Intelligence (AI) researchers. One among them was Geoffrey Hinton, a co-author of the seminal backpropagation paper who is prominently referred to as the godfather of deep learning. Hinton playfully describes this complex co-adaptation as conspiracies between neurons, stating that he was inspired by a fraud prevention system at his bank. The bank continuously rotated its employees, so that whenever Hinton paid it a visit, he would always encounter a different person behind the desk.
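The following is a minimal sketch of how dropout layers can be inserted between the dense layers of a model, assuming TensorFlow's Keras API; the layer sizes, input shape, and the dropout rate of 0.5 are illustrative choices rather than values taken from this section:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Each Dropout layer randomly zeroes a fraction (here, 50%) of the preceding
# layer's activations on every training step; the dropped neurons contribute
# nothing to the forward pass and receive no weight updates on that step.
model = Sequential([
    Dense(256, activation='relu', input_shape=(784,)),
    Dropout(0.5),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Note that Keras applies dropout only during training; at inference time the Dropout layers are bypassed automatically, so predictions are made with the full network.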