Let’s say you have a dataset of 100 elements. Which would you prefer: all 100 predictions close to the target with an error of 0.1 each, or 99 hitting the target exactly, but one specific case where the error is 10?

In both cases the total absolute error is the same:

Case A: 100*0.1 = 10

Case B: 99*0+10 = 10

In case A the model is reliable; an error of 0.1 is fine. In case B, the model fits 99/100 cases but can kill a guy on that one extreme case. We prefer a model that is reliable overall to one that makes an extreme error on a single case. Now if we square the errors, you get:

Case A: 100*0.1*0.1 = 1

Case B: 99*0*0 + 10*10 = 100

We can see that under squared error, case A scores much better than case B: squaring penalizes the single large error far more heavily than the many small ones. Hope it’s clearer :)
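The arithmetic above can be checked with a few lines of Python (a minimal sketch; the two error lists are the hypothetical cases from the example, not real data):

```python
# Case A: every prediction off by 0.1; Case B: 99 perfect, one off by 10.
errors_a = [0.1] * 100
errors_b = [0.0] * 99 + [10.0]

# Total absolute error (the quantity MAE sums before averaging)
abs_a = sum(abs(e) for e in errors_a)  # 100 * 0.1 = 10
abs_b = sum(abs(e) for e in errors_b)  # 99*0 + 10 = 10

# Total squared error (the quantity MSE sums before averaging)
sq_a = sum(e * e for e in errors_a)    # 100 * 0.01 = 1
sq_b = sum(e * e for e in errors_b)    # 10 * 10 = 100

print(abs_a, abs_b)  # both are 10: absolute error cannot tell the cases apart
print(sq_a, sq_b)    # 1 vs 100: squared error strongly penalizes case B
```

The absolute errors tie at 10, so a loss like MAE is indifferent between the two models, while the squared errors (1 vs 100) make an MSE-style loss strongly prefer the reliable model A.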

Interested in artificial intelligence, machine learning, neural networks, data science, blockchain, technology, astronomy. Co-founder of Datathings, Luxembourg
