Multi-layer feed-forward neural network
Historically, artificial neural networks have been largely identified with multi-layer feed-forward perceptrons, so we begin with the primitive elements of such networks, how to train them, the problem of overfitting, and techniques to address it.
Inputs, neurons, activation function, and mathematical notation
A single neuron or perceptron is the same as the unit described in the Linear Regression topic in Chapter 2, Practical Approach to Real-World Supervised Learning. In this chapter, the data instance vector is represented by x and has d dimensions, and each dimension can be represented as $x_i$, for i = 1, ..., d. The weights associated with each dimension are represented as a weight vector w that also has d dimensions, and each dimension can be represented as $w_i$. Each neuron has an extra input b, known as the bias, associated with it.
Neuron pre-activation performs the linear transformation of inputs given by:

$$a(x) = b + \sum_{i=1}^{d} w_i x_i = b + w^{\top} x$$
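The pre-activation of a single neuron, a weighted sum of the d inputs plus the bias, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `pre_activation`, the variable `a`, and the sample values are hypothetical and do not come from the text.

```python
def pre_activation(x, w, b):
    """Return b + sum_i w_i * x_i for one neuron.

    x: input values (the d dimensions of one data instance)
    w: weights, one per input dimension
    b: bias term
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same number of dimensions")
    # Weighted sum of the inputs, then shift by the bias.
    return b + sum(w_i * x_i for w_i, x_i in zip(w, x))


# Illustrative values only (d = 3).
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 2.0]
b = 0.5

a = pre_activation(x, w, b)
print(a)  # 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 + 0.5 = 5.0
```

The result is still a linear function of the inputs; the activation function discussed next is applied to this value.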
The activation function...