Artificial neural networks (NN for short) are practical, elegant, and mathematically fascinating models for machine learning. They are inspired by the central nervous systems of humans and animals – smaller processing units (neurons) are connected together to form a complex network that is capable of learning and adapting.
The idea of such neural networks is not new. McCulloch-Pitts (1943) described binary threshold neurons already back in 1940’s. Rosenblatt (1958) popularised the use of perceptrons, a specific type of neurons, as very flexible tools for performing a variety of tasks. The rise of neural networks was halted after Minsky and Papert (1969) published a book about the capabilities of perceptrons, and mathematically proved that they can’t really do very much. This result was quickly generalised to all neural networks, whereas it actually applied only to a specific type of perceptrons, leading to neural networks being disregarded as a viable machine learning method.
In recent years, however, the neural network has made an impressive comeback. Research in the area has become much more active, and neural networks have been found to be more than capable learners, breaking state-of-the-art results on a wide variety of tasks. This has been substantially helped by developments in computing hardware, allowing us to train very large complex networks in reasonable time. In order to learn more about neural networks, we must first understand the concept of vector space, and this is where we’ll start.